Abstract

Low-latency communication in large-scale multiprocessors requires
high-performance interconnection schemes. Multistage interconnection networks
with redundant paths combine high performance with fault-tolerance, but
exact evaluation of the blocking probability of interconnection networks with
redundant paths is expensive. Equations for the blocking probability and
throughput of multistage, multipath interconnection networks are derived.
A method of approximate solution of the equations is presented, with a
derivation of error bounds on the estimated solution. A program that solves
the equations exactly and approximately is presented.

Chapter 1

Introduction 


1.1  Background 

The realization of low-latency communication in large-scale multiprocessors
requires high-performance interconnection schemes. Both direct and indirect
networks are examples of these; here our focus is on self-routing, multistage
networks, both with unique paths and with multiple (redundant) paths.

One popular measure of the performance of a multistage interconnection
network is its bandwidth or throughput, that is, the expected number of
messages it delivers in each cycle, where the inputs have some given probability
of generating a message. A related measure, from which the bandwidth
may be calculated in some models, is the probability of successful message
transmission or the normalized throughput: the probability that an
arbitrary message at an input is not blocked (and presumably queued for later
service) by some other request in the course of delivery. The problem of
calculating the probability of successful message transmission is more often
referred to in the telephone switching literature by its complement, the
blocking probability, and we shall do the same here.

The problem of computing blocking probabilities in regular variants of
unique-path multistage interconnection networks has been extensively
studied. These networks were called Banyan networks by Goke and Lipovski
[9]. Patel [21] and Kruskal and Snir [15] in particular presented expressions
for the probability of successful message transmission of delta networks,
which are a particular regular variant of Banyan networks. Multiprocessors
have been built using such regular Banyan networks for interconnection
[22, 21]. A later chapter presents a method for calculating the exact
blocking probabilities of unbuffered Banyan networks that applies not only for
delta networks, but in the general case of any unique-path network. The
method applies where sources generate messages with different probabilities,
as well as where different destinations have different probabilities of having
messages addressed to them.

However,  precisely  because  Banyan  networks  are  unique-path  networks, 
they  are  not  inherently  fault-tolerant.  The  failure  of  a  switching  element 
will  necessarily  cut  off  communication  between  at  least  one  message  source 
and  one  message  sink  in  the  network.  A  scheme  that  allows  replacement  of 
failed  components  with  idle  spares  must  be  used  to  maintain  connectivity. 
This is the approach used in, e.g., the extra-stage cube network [)], or in
the dynamic redundancy network [Id], both of which emulate a (Banyan)
indirect cube network and provide fault-tolerance by reconfiguring in the
presence of faults.

An alternative to the maintenance of idle spares is to make active use
of the "spares" to increase bandwidth, by building a multipath network.
Here we mean that, in the course of normal (fault-free) communication, the
redundant paths are used in routing packets to their destinations. Some
examples of these are the augmented delta network [8], the multibutterfly
network [2b], and the merged delta network [23].

Both  fault-tolerance  and  performance  can  be  enhanced  with  the  addition 
of  multiple  paths.  Unfortunately,  multipath  networks  create  problems  for 
the  traffic  theorist.  In  a  Banyan  network,  if  one  assumes  messages  at  the 
inputs  are  generated  by  independent  processes,  the  presence  or  absence  of 
messages  at  the  inputs  of  any  switch  in  the  network  is  independent  of  the 
presence  or  absence  of  messages  at  the  other  inputs  of  that  switch.1  Thus 
the  analysis  of  blocking  probabilities  in  Banyan  networks  is  simplified,  and 
polynomial-time algorithms exist for calculating the exact blocking
probability [14]. When multiple paths are allowed, independence is violated.

The author has found in the literature no polynomial-time algorithm
that calculates the exact blocking probability of a multipath network, nor
any proof that the problem is NP- or #P-hard. The method described in a
later chapter, for synchronous, packet-switched multipath networks, requires
the solution of a number of equations that is exponential in the number
of communications channels entering a stage in the network. A program
that automatically solves these equations exactly, given a description of
the network, is presented in what follows; but it cannot be used on large
networks, as the running time grows too quickly.

1 As will be shown in Section 3.2.

Thus an approximation method must be used to estimate the blocking
probability of larger networks. The exact solution remains useful, not
only because it is used in the approximation method, but because it allows
some evaluation of approximation methods through comparison with exact
solutions for small problems. We consider two approximation methods.

The  first  is  direct  simulation  of  the  network,  where  sample  input  loads 
are  selected  and  offered  to  the  simulation,  the  fraction  of  messages  blocked 
in  each  is  calculated,  and  a  blocking  probability  is  estimated.  The  second, 
which  is  more  satisfactory  because  it  achieves  the  same  error  bounds  in  less 
running  time  than  does  direct  simulation,  is  approximation  of  the  solution 
to  the  equations,  by  a  Monte  Carlo  method  that  we  shall  describe.  This  is 
similar  to  the  approach  taken  by  Harvey  and  Hills  in  [11].  Harvey  and  Hills 
were considering circuit-switched telephone networks with unique paths; but
their  approach,  which  was  to  find  approximate  solutions  of  exact  equations, 
rather  than  exact  solutions  to  approximate  equations,  can  still  be  of  use 
here. 

1.2  Prior  Work 

The  earliest  work  in  analysis  of  the  performance  of  interconnection  networks 
was  driven  by  the  need  to  efficiently  switch  telephone  traffic.  Some  of  the 
earlier  work  on  interconnection  networks  and  their  performance,  by  Clos  [7] 
and  Benes  [3],  concentrated  on  the  design  of  non-blocking  networks,  networks 
for  which  a  connection  that  constitutes  a  bijective  mapping  from  sources  to 
destinations  can  always  be  accomplished  without  blocking. 

Non-isochronous applications such as shared-memory references in a
multiprocessor can better tolerate blocking, and thus often use blocking variants
of Banyan networks, as presented by Goke and Lipovski [9].

Patel  [21]  presented  a  probabilistic  analysis  of  the  blocking  probability 
of  delta  networks,  a  subset  of  the  more  general  class  of  Banyan  networks. 
His  work  assumed  that  all  sources  transmit  with  uniform  probability,  and 
that all destinations are selected with uniform probability. Bhuyan [5] has
extended Patel's work to include analysis of the case where each processor
has a single favorite destination that is not the favorite destination of any
other processor. Kruskal and Snir [15] have extended Patel's work by finding
an asymptotic expression for the blocking probability in networks with large
numbers of stages.

Analyses that model the buffering that must be used due to blocking
in Banyan networks have also been developed; two recent examples include
the work of Merchant [18] and Lin and Kleinrock [17]. These models cannot
be used for multipath networks, however, due to the above-mentioned
correlation of channel loads in a multipath network.

The literature on performance of multistage multipath networks is more
sparse. Specific topologies are usually simulated, as in [2], [8], [2d], and [10].
A similar problem has been studied in the context of telephone switching
systems [12]. However, in telephone switching systems the model is one of
a circuit-switched network where the holding time for circuits varies.
Furthermore, in the methods described in [12], it is assumed that the networks
modeled are symmetric; because there are classes of asymmetric networks
that are of interest,2 and because we are partly interested in calculating
blocking probabilities in the presence of (asymmetric) faults, these methods
are not satisfactory.


1.3  Motivation 

Our goal in this work is to provide a tool that can be used by multiprocessor
architects to easily compare the performance of competing multistage
interconnection network structures. A secondary goal is to provide an analysis
that highlights some of the aspects of interconnection network structure that
have particular bearing on performance.

Almost  all  Banyan  networks  used  in  multiprocessors  to  date  have  been 
delta  or  omega  networks,  and  the  performance  of  these  has  been  studied 
extensively.  Our  contribution  is  in  providing  a  method  of  calculating  the 
throughputs  of  Banyan  networks  of  arbitrary  interconnection  structure  and 
with  unusual  switching  components.  The  method  allows  easy  modeling  of 
cases  with  general  destination  distributions  and  general  source  transmission 
probabilities. 

Multipath, multistage network performance has been less widely
studied. The correlation of channel loads can have significant effects on the
performance of these networks. The methods we develop calculate the joint
probability mass function of groups of channels between stages of the
network to allow calculation of multipath network performance parameters. We
hope that the multipath network designer who wishes to examine the results
of a design decision will be able to achieve some insight from our model.

Random interwiring of multipath networks (as described in [16]) for
fault-tolerance yields a large space of possible network structures; one might
generate a number of these, insert faults randomly and select the one with
the best performance. In [2], Chong et al. describe the use of simulation to
evaluate different circuit-switched multipath networks, including randomly-
and deterministically-interwired networks. The method we develop allows a
quick measure of performance in fewer steps than does direct simulation of
the network.

2 E.g., the randomly-interwired butterflies of [16].


1.4  Approach 

We use a simple model of offered traffic in our calculation of blocking
probabilities for both Banyan and multipath networks. In our model, although
different inputs in the network can have varying probabilities of
transmission, we assume that the messages presented at the inputs to the network
are produced by independent memoryless processes.

This  model  is  known  to  be  optimistic.  The  throughput  calculated  in  an 
analysis  using  this  model  will  be  higher  than  the  throughput  calculated  in 
simulations  that  include  buffering,  or  in  more  detailed  analyses  that  model 
buffering. We can understand one reason for the optimism of the model
by considering that it cannot account for multiple conflicting requests
presented to the network. In the case of, say, a three-way collision between
requests competing for the same resource in one cycle, only one request can
be serviced, and there will necessarily be a collision again at the next cycle
between the remaining requests.

Patel  has  noted  the  optimism  of  this  memoryless  model  and  comments 
that  in  his  simulations  that  took  buffering  into  account,  the  probability 
of  successful  message  transmission  varied  only  slightly  from  that  predicted 
by the memoryless model [21]. Nussbaum et al. examined the analogous
assumption for circuit-switched interconnects and reported that the error in
the memoryless model was at all loads less than 10%, and suggested that
for most purposes the memoryless model should probably be preferred for
its simplicity [20].

Bhandarkar examined in particular the probability that a memory
element in a distributed-memory multiprocessor system would be busy, and
compared his analysis, which did model buffering of blocked requests, with
a memoryless model [4]. His conclusion was that where the ratio of the
number of memories to the number of processors was greater than 0.75,
the expected number of busy memories in the memoryless model is always
within 6 to 8% of that in the buffering model.

The results reported by Chang et al. were similar: in examining the
throughput of multiprocessor memories, they found that a memoryless model
was always 6 to 8% more optimistic than the results they generated with an
analysis that modeled queueing of memory requests [6].

Given  that  our  primary  goal  is  to  provide  multiprocessor  architects  with 
a  means  of  comparing  the  performance  of  alternative  network  structures, 
we  deemed  the  known  optimism  of  the  memoryless  model  to  be  worth  the 
simplicity  it  affords,  especially  in  view  of  the  complexity  of  the  problem  of 
deriving  blocking  probabilities  in  multipath  networks. 

1.5  Outline 

In the remainder of this document, we further define the problem of
calculating the throughput and blocking probability of a multistage interconnection
network and present methods for solving it.

In Chapter 2 we define the problem and our model specifically enough to
allow the description of a method for analyzing the performance of Banyan
networks. Chapter 3 presents that method, as well as a program that
calculates the performance parameters numerically or symbolically.

Chapter 4 further defines our model to include multipath networks and
presents equations for exactly analyzing the performance of multipath
networks. Chapter 5 presents means of approximating the performance
parameters of multipath networks. Finally, we have included a listing of our
procedures for Banyan network performance evaluation in an appendix, as
these were compact enough to make such presentation practical.


Chapter  2 

Problem  Statement 


2.1  The  Model 

An indirect network is one in which the network switching elements are
segregated from the inputs and outputs of the network, as in Figure 2.1.
The message sources inject messages into the network at the inputs, which
in Figure 2.1 are depicted on the left side and are labeled $I_0$ through $I_7$.
Messages are routed through the network and arrive at the message sinks on
the right side, labeled $O_0$ through $O_7$. In a multiprocessor, the network input
channels might connect to N processing elements, and the output channels
might connect to the same N processing elements.

The particular class of indirect networks that we model is the class of
multistage, unbuffered, synchronous, packet-switched networks. Such a
network might look like the one depicted in Figure 2.2. This network has
multiple stages: if we consider the stage consisting of all the sources to be
stage 0, then stage 1 consists of the column of switching elements connected
directly to the sources, stage 2 the column of switches to the right of stage
1, etc.

Figure 2.1: An indirect network is one in which switching elements are
segregated from the inputs and outputs of the network. Messages enter the
network through the input channels on the left side, and are routed to the
output channels on the right side.

Figure 2.2: An 8 x 8 Banyan network. This is a unique-path network called
an indirect cube; multipath networks will be treated in Chapter 4.

The networks we consider are self-routing: each message contains the
information necessary to route the message from the source where it is
injected to the sink that is its destination. No global information is used in
the routing process, so that the probability mass function of the loads on
the output channels of a switch can be calculated from the probabilities of
the loads on the input channels. As a simple example, in the indirect cube
of Figure 2.2, 2 x 2 switching elements route on individual bits of the
destination address, starting with the low-order bit. There are $\log_2 N = 3$ address
bits: the switching elements at stage 0 route on the $2^0$'s bit, those at stage
1 on the $2^1$'s bit, and those at stage 2 on the $2^2$'s bit. A cleared address bit
indicates that the message should be routed through the upper channel; a
set address bit indicates that it should be routed through the lower channel.
Thus a message addressed to destination $6 = 110_2$ leaves stage 0 through
an upper channel, stage 1 through a lower channel, and stage 2 through a
lower channel.
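As a minimal sketch of the bit-by-bit routing rule just described, the following Python fragment lists the output channel a message takes at each stage of the indirect cube. The function name `route` and the parameter `n_bits` are illustrative choices of ours and do not come from the program presented later.

```python
def route(dest, n_bits=3):
    """Return the channel ('upper' or 'lower') taken at each stage.

    Stage k examines bit k of the destination address, starting with the
    low-order bit: a cleared bit selects the upper channel, a set bit the
    lower channel.
    """
    return ["lower" if (dest >> stage) & 1 else "upper"
            for stage in range(n_bits)]


# Destination 6 = 110 in binary: upper at stage 0, lower at stages 1 and 2.
print(route(6))  # ['upper', 'lower', 'lower']
```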

Blocking occurs in the 2 x 2 crossbar when two messages arriving at
the inputs are both to be routed through the same output channel. Both
requests cannot be serviced, and so one of the messages is routed through
the output channel, and the other is blocked. In our model, both messages
have equal likelihood of being routed through the output channel, and many
switching elements behave this way; but one might easily modify the analyses
we present to change this assumption.

We also consider networks in which the channels between stages can carry
more than one message. Kruskal and Snir referred to such networks as dilated
networks [15], and we follow their lead here; furthermore we call switching
elements in which the output ports can pass more than one message dilated
switching elements. We refer to each of the dilated output ports as a logical
direction. If a switch has N input ports, each of which can receive a single
message, and M output ports, each of which can send up to K messages
simultaneously, we call it an N x M, dilation K switch.

We calculate the throughput of the network under the following
assumptions:

•  The processes generating messages at the sources are independent and
memoryless. At the beginning of each cycle, each source i generates a
single message with some specified probability $P_i$, and otherwise
generates none. Each generated message is directed to a stage 1 switch.

•  The network is synchronous: at each cycle messages move from stage
i to stage i + 1.

•  The network is treated as unbuffered (as described in Section 1.4): if a
message is blocked at some stage, it is considered to be lost, and does
not in any way affect the future states of the system.

•  If $A_0, A_1, \ldots$ are random variables representing the addresses of
messages generated in some particular cycle by message sources 0, 1, ...,
then the $A_i$ are independent and identically distributed; the
distribution can be specified as a parameter of the model.

We define our model further in Chapter 4, extending it as necessary for
multipath networks.
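The assumptions above can be exercised with a small direct simulation. The sketch below is ours and is only one possible reading of the model: it assumes the 8 x 8 indirect cube of Figure 2.2, labels each channel with a three-bit line number, and assumes that the switch at stage k pairs the two lines that differ in bit k. The names `simulate_cycle` and `estimate_bandwidth` are hypothetical.

```python
import random


def simulate_cycle(dests, n_bits=3, rng=random):
    """Route one cycle of traffic through an unbuffered indirect cube.

    dests[i] is the destination address of the message at source i, or None
    if source i is idle. Returns the sorted destinations that were reached.
    """
    # Map each occupied line to the destination of the message it carries.
    lines = {src: d for src, d in enumerate(dests) if d is not None}
    for k in range(n_bits):
        contenders = {}
        for line, dest in lines.items():
            bit = (dest >> k) & 1
            # Assumed labeling: the output line is the input line with
            # bit k replaced by bit k of the destination address.
            out_line = (line & ~(1 << k)) | (bit << k)
            contenders.setdefault(out_line, []).append(dest)
        # Each output channel passes one message chosen with equal
        # likelihood; the others are blocked and lost (unbuffered model).
        lines = {out: rng.choice(msgs) for out, msgs in contenders.items()}
    return sorted(lines.values())


def estimate_bandwidth(p, cycles, seed=0, n_bits=3):
    """Average number of messages delivered per cycle, uniform addresses."""
    rng = random.Random(seed)
    n = 1 << n_bits
    delivered = 0
    for _ in range(cycles):
        dests = [rng.randrange(n) if rng.random() < p else None
                 for _ in range(n)]
        delivered += len(simulate_cycle(dests, n_bits, rng))
    return delivered / cycles


print(simulate_cycle(list(range(8))))  # identity permutation: nothing blocked
print(simulate_cycle([0] * 8))         # all sources address sink 0: one survives
print(estimate_bandwidth(0.5, 2000))   # sampled estimate; varies with the seed
```

Because blocked messages are simply dropped, each cycle is independent of the last, which is what makes the memoryless treatment possible.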


2.2  The  Problem 


We are interested in deriving the bandwidth, or throughput, of a multistage
interconnection network, that is, the expected number of messages it
delivers in a cycle. We calculate this number by finding the probability mass
functions1 of the loads on channels leading to sinks.

Suppose that the network has M sources. Call the probability that the
ith source generates a message in a given cycle $P_i$. If we say that B is
the bandwidth and $P_S$ is the probability of successful message transmission,
then we may calculate $P_S$ as the ratio of B to the expectation of the input
message loading. $P_S$ will vary with the input loading, because of internal
blocking in the network. B, too, will vary because of internal blocking and
also directly with the number of messages entering the network. Thus we
can better express $P_S$ and B as functions of the $P_i$, giving us the relation:

$$P_S(P_0, P_1, \ldots, P_{M-1}) = \frac{B(P_0, P_1, \ldots, P_{M-1})}{\sum_{i=0}^{M-1} P_i}$$

Thus  our  problem  is  finding  the  probability  mass  functions  of  the  loads  on 
channels  leading  to  sinks.  These  probability  mass  functions  can  also  be  used 
to  specify  other  information  besides  mean  throughput:  if  the  network  is  not 
symmetric,  or  if  a  non-uniform  destination  address  distribution  for  injected 
messages  is  specified,  or  if  different  sources  are  specified  to  have  different 
probabilities  of  message  generation,  then  the  loads  on  individual  channels 
leading to sinks will be of interest in finding the effects of the asymmetries on
traffic  to  particular  destinations. 

The quantities B and $P_S$ will typically vary smoothly with the source
transmission probabilities $P_i$. Let us consider a simple case. If the
destination distribution is uniform, all sources i have equal probability of generating
messages, and we vary $P_i$ between 0 and 1 for the network of Figure 2.2,
the resulting graphs for the bandwidth and probability of successful message
transmission are as shown in Figures 2.3 and 2.4, respectively.

1 Joint probability mass functions, in the case of multipath networks.

Figure 2.3: Bandwidth (labeled E[A]) plotted versus message generation
probability (labeled P0) for the network of Figure 2.2. Here the destination
address distribution is uniform.

Figure 2.4: Probability of successful message transmission (labeled
P{Success}) plotted versus message generation probability (labeled P0) for
the network of Figure 2.2, with a uniform destination address distribution.

The probability of successful message transmission is close to 1 when
there are very few messages injected into the network, because there is little
blocking in a nearly empty network. $P_S$ decreases as the number of
messages injected into the network increases. The bandwidth or throughput
starts at 0, when no messages are being injected into the network, and,
because of blocking, increases less than linearly as the probability of message
transmission increases.
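For the uniform case, the shape of these curves can be reproduced with a few lines of Python. The sketch below is not the general method of Chapter 3; it assumes the well-known stage-by-stage recurrence for 2 x 2 crossbars under uniform traffic, of the kind presented by Patel [21]: an output channel is idle only when neither input directs a message to it, so a channel load probability p at the inputs becomes $1 - (1 - p/2)^2$ at the outputs. The function name `banyan_2x2_uniform` is hypothetical.

```python
def banyan_2x2_uniform(p0, stages=3):
    """Bandwidth and success probability for a Banyan network built from
    2 x 2 crossbars, with every source transmitting with probability p0
    and uniformly distributed destination addresses."""
    p = p0
    for _ in range(stages):
        # Each input sends a message to a given output with probability p/2.
        p = 1.0 - (1.0 - p / 2.0) ** 2
    n_sinks = 2 ** stages
    bandwidth = n_sinks * p                   # expected messages per cycle
    p_success = p / p0 if p0 > 0 else 1.0     # B divided by the offered load
    return bandwidth, p_success


for p0 in (0.1, 0.5, 1.0):
    b, ps = banyan_2x2_uniform(p0)
    print(f"P0={p0:.1f}  B={b:.4f}  P_S={ps:.4f}")
```

At full load (every source transmitting in every cycle) the recurrence gives a bandwidth of about 4.13 messages per cycle for the eight-input network, so roughly half of the offered messages are delivered.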


Chapter  3 

Performance  of  Banyan 
Networks1 

3.1  Introduction 

In  this  chapter  we  present  a  method  of  calculating  the  throughput  of  a 
Banyan network. As described in Section 1.2, Patel [21] and Kruskal and
Snir  [15]  have  presented  solutions  to  this  problem  for  regular  variants  made 
up  of  crossbar  switching  devices,  but  we  present  a  method  that  works  for 
Banyan  networks  of  arbitrary  interconnection  structure  and  allows  modeling 
of  some  unusual  switching  devices. 

Consider first the probability mass function of the message load on a
single channel in a Banyan network. The channel may either be carrying a
message, in which case its message load is one, or it may be idle, in which
case its message load is zero. Let the random variable l denote the message
load. The two values that l can take on partition the space of possible
loading configurations for the network into two disjoint subsets. l is then
a Bernoulli random variable, and we use the notation $p_l(l_0)$ to denote the
value of its probability mass function at $l_0$.2 We denote the value of the
z-transform of l's probability mass function at z with the notation $p_l^T(z)$.

Our approach will be to define three operations on the probability mass
functions of channel loads. These are called bundling, switching, and
concentration. They are represented graphically as depicted in Figure 3.1. We
compose these operations to model switching elements, and further
according to the interconnection structure of the network. The result is an
operation that transforms the probability mass functions of the loads on input
channels to the probability mass function of the load on an output channel.

Figure 3.1: (a) The symbol for bundling two input bundles into one. (b)
The symbol for concentrating j channels to k channels. (c) The symbol for
switching with probability q to the top output channel, and 1 - q to the
bottom output channel.

1 The work described in this chapter was performed jointly with Dr. Thomas F. Knight,
Jr., and has been described in [14].

2 In later sections we will also use the notation $P\{l = l_0\}$, when this is convenient.


3.2  Loads on Banyan Network Channels at a Single Stage are Independent

We require a simple proof to proceed. We will be forming the sum of the
loads on distinct channels in a single stage in a Banyan network, and thus
we need to understand how, if at all, the random variables we are summing
are correlated. It turns out that these loads are in fact independent. A proof
for the special case of delta networks is presented in [21]; here we present a
different proof for the general case.

The proof is straightforward. Note first that, if messages are generated
at source nodes by mutually independent random processes, and the sets
of messages on distinct channels entering a switching node originate at
disjoint sets of source nodes, then the loads on those channels are necessarily
independent.

We now claim that the sets of messages on distinct channels entering
any switching node in a Banyan network satisfy this criterion; i.e., they
originate at disjoint sets of sources.

For, consider: if channel A and channel B are two channels entering a
switching node, and a message on channel A and a message on channel B
originate at a single source, then it must be the case that at least two paths
exist from that source to any sinks accessible from the switching node: one
path that uses channel A and one that uses channel B. But this is impossible
in a Banyan network, as Banyan networks are in fact those in which there
is exactly one path from each source to each sink.

Thus the sets of messages on distinct channels entering any switching
node in a Banyan network must originate at disjoint sets of sources, and so
the loads on the channels entering any switching node in a Banyan network
must be mutually independent, as was to be proved.
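The disjointness argument can be checked mechanically on a small example. The sketch below is an illustration of ours, not part of the proof: it assumes the 8 x 8 indirect cube of Figure 2.2 with channels labeled by three-bit line numbers, where the switch at stage k pairs the two lines differing in bit k and overwrites bit k with the corresponding destination bit. Under that assumed labeling, a message from source s can appear on line x at the input of stage k only if s and x agree in all bits from bit k upward. The function names are hypothetical.

```python
def sources_reaching(line, stage, n_bits=3):
    """Sources whose messages can appear on `line` at the input of `stage`.

    Stages 0..stage-1 have overwritten only the low `stage` bits of the
    line number, so the remaining high bits still equal those of the source.
    """
    n = 1 << n_bits
    return {s for s in range(n) if (s >> stage) == (line >> stage)}


def inputs_have_disjoint_sources(n_bits=3):
    """Check every switch: its two input channels draw on disjoint sources."""
    n = 1 << n_bits
    for stage in range(n_bits):
        for line in range(n):
            partner = line ^ (1 << stage)  # the other input of the same switch
            shared = (sources_reaching(line, stage, n_bits)
                      & sources_reaching(partner, stage, n_bits))
            if shared:
                return False
    return True


print(sources_reaching(5, 2))          # {4, 5, 6, 7}
print(inputs_have_disjoint_sources())  # True
```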

3.3  Bundling 

We call the operation of summing the loads on a group of channels bundling.
We will call such a group of channels a bundle.

Because channel loads are independent, if we are summing loads a and
b, then we form the convolution of their probability mass functions. We use
the notation

$$\mathcal{P}\left[p_a(a_0),\, p_b(b_0)\right] = p_a(a_0) * p_b(b_0) \qquad (3.1)$$

Of course, this operation can be performed on bundles, as well as on
single channels. The result of bundling two bundles composed respectively
of n and m single channels is a bundle whose load can take on values
ranging from 0 through n + m. We depict in Figure 3.2 one possible loading
probability mass function of a bundle composed of 8 single channels.

In the z-domain, bundling becomes multiplication of the z-transforms
of the loading probability mass functions in question:

$$\mathcal{Z}\left[\mathcal{P}\left[p_a(a_0),\, p_b(b_0)\right]\right] = p_a^T(z) \cdot p_b^T(z) \qquad (3.2)$$
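To make the bundling operation concrete, the following sketch represents a loading probability mass function as a Python list whose entry i is the probability of a load of i messages, and bundles by discrete convolution. The names `bundle` and `bundle_channels` are illustrative and are not taken from the program described in this chapter; the example assumes eight channels that each carry a message with probability 1/2.

```python
def bundle(pa, pb):
    """Convolve two loading PMFs; valid when the two loads are independent."""
    out = [0.0] * (len(pa) + len(pb) - 1)
    for i, x in enumerate(pa):
        for j, y in enumerate(pb):
            out[i + j] += x * y
    return out


def bundle_channels(probs):
    """Bundle single channels, where channel i is busy with probability probs[i]."""
    pmf = [1.0]  # an empty bundle carries load 0 with certainty
    for p in probs:
        pmf = bundle(pmf, [1.0 - p, p])  # Bernoulli PMF of one channel
    return pmf


pmf = bundle_channels([0.5] * 8)
print(len(pmf))  # 9 possible loads, 0 through 8
print(pmf[4])    # 0.2734375, i.e. 70/256
```

Multiplying polynomials in z whose coefficients are the PMF values produces exactly the same coefficients, which is the content of equation (3.2).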


3.4  Concentration 

Suppose in an N x M, dilation K switch4 more than K arriving messages
are to be routed in a particular logical direction. Some of the messages are
then blocked and must be dropped. We call the operation that corresponds
to this situation concentration.

4 See Section 2.1 for an explanation of dilated switches.

Figure 3.2: Loading probability mass function for an eight-channel bundle,
where each channel carries a message with probability 1/2.

More specifically, suppose we have a bundle of N single channels, whose
load we call a, and we wish to direct messages from it into a bundle of K
single channels, whose load we call b. Of course if $N \le K$, $p_b(l_0) = p_a(l_0)$ for
all loads $l_0$, because in this case none of the messages on the input bundle
will ever be blocked. If N > K, we calculate the probability mass function
of b as follows:

• Because the output bundle carries fewer than K messages exactly when
the input bundle carries fewer than K messages, we have that
$p_b(l_0) = p_a(l_0)$ where $l_0 < K$.

• The output bundle will carry K messages whenever the input bundle
carries at least K messages; thus we have that $p_b(K) = \sum_{l_0=K}^{N} p_a(l_0)$.

• Because the output bundle cannot carry more than K messages,
$p_b(l_0) = 0$ for $l_0 > K$.


Intuitively, then, we can think of the effect of concentration on the input
loading probability mass function as truncating it at K, by setting
probabilities for loads greater than K to 0 and adding to the probability for
K the probabilities for all greater input loads. In Figure 3.3 we show the
result of concentrating to 6 channels the loading probability mass function
of Figure 3.2.

Figure 3.3: 6-concentration of the loading probability mass function of
Figure 3.2.
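The truncation just described can be sketched in a few lines of Python. The name `concentrate` and the list-indexed-by-load representation are assumptions of this sketch, not the Appendix A implementation.

```python
from math import comb


def concentrate(pmf, k):
    """Concentrate a loading PMF (list indexed by load) onto k channels.

    Loads below k keep their probability; all probability for loads of
    k or more is moved onto load k.
    """
    if len(pmf) - 1 <= k:
        # The input bundle is no wider than the output: nothing is blocked.
        return list(pmf)
    return list(pmf[:k]) + [sum(pmf[k:])]


# Eight channels, each carrying a message with probability 1/2 (as in Figure 3.2).
eight = [comb(8, n) / 256 for n in range(9)]

# Concentrate to six channels (as in Figure 3.3).
six = concentrate(eight, 6)
```

Here `six[6]` collects the probabilities of input loads 6, 7, and 8, that is (28 + 8 + 1)/256.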

If $\delta(n)$ denotes the value of the unit impulse function at n, we can express
the loading probability mass function of an N-channel bundle as an impulse
train with value $k_i$ at i:

$$p_l(l_0) = \sum_{i=0}^{N} k_i \, \delta(l_0 - i)$$

If $u(n)$ is the value of the unit step function at n, we can express
concentration of this N-channel bundle to a K-channel bundle as follows:

$$\mathcal{C}_{N,K}[p_l(l_0)] = p_l(l_0) \, u(K - l_0) + \left( \sum_{l_1=K+1}^{N} p_l(l_1) \right) \delta(l_0 - K) \qquad (3.3)$$



In the z-domain, we have

$$\mathcal{Z}[\mathcal{C}_{N,K}[p_l(l_0)]] = p_l^T(z) - \sum_{l_1=K+1}^{N} p_l(l_1) \, z^{l_1} + \left( \sum_{l_1=K+1}^{N} p_l(l_1) \right) z^{K}$$

Combining the two summations, we get

$$\mathcal{Z}[\mathcal{C}_{N,K}[p_l(l_0)]] = p_l^T(z) + \sum_{l_1=K+1}^{N} p_l(l_1) \left( z^{K} - z^{l_1} \right) \qquad (3.4)$$


3.5  Switching 

We  call  the  elementary  operation  of  directing  the  messages  on  a  bundle  to 
two  other  bundles  of  the  same  width  as  the  input  bundle  switching.  Here  we 
do not mean to use the term in precisely the sense that it is used when we
speak of, e.g., a 2 x 2 switch. In the elementary operation we call switching,
no  blocking  is  modeled:  no  messages  can  be  lost.  What  we  are  modeling 
instead  is  the  direction  of  messages  to  separate  ports  in  routing. 

We  specify  the  probability  that  the  messages  on  the  input  load  are 
switched  in  the  direction  of  the  output  load.  This  probability  is  calculated 
in accordance with the destination address distribution (as will be described
in Section 3.6); but as an example, for the 2 x 2 crossbars in the network of
Figure  2.2  under  a  uniform  destination  address  distribution  the  probability 
specified  for  the  switching  operation  will  be  1/2. 

Thus the switching operation is performed on an input loading probability
mass function and a switching probability, and its result is an output
loading probability mass function. Call the load on the input bundle a, and
that on the output bundle b, and say that the input bundle (and perforce
the output bundle) is composed of N single channels.

We form $p_b(b_0)$ by conditioning on the number of messages on the input
bundle:

$$p_b(b_0) = \sum_{a_0=0}^{N} p_{b|a}(b_0 \mid a_0) \, p_a(a_0)$$

To evaluate the conditional probability, let q be the probability that
an input message is switched to the output bundle. By independence of
message destinations, each message is switched independently, and thus the
number of messages switched to the output bundle is binomially distributed,
because it is the number of successes in $a_0$ independent Bernoulli trials with
probability q of success:

$$p_{b|a}(b_0 \mid a_0) = \binom{a_0}{b_0} q^{b_0} (1 - q)^{a_0 - b_0}$$

Substituting, we have

$$\mathcal{S}[p_a(a_0), q] = p_b(b_0) = \sum_{a_0=0}^{N} \binom{a_0}{b_0} q^{b_0} (1 - q)^{a_0 - b_0} \, p_a(a_0) \qquad (3.5)$$
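Equation (3.5) translates directly into a double loop. The sketch below is a minimal Python rendering under the assumption that a loading probability mass function is a list indexed by load; the name `switch` is chosen here for illustration.

```python
from math import comb


def switch(pmf, q):
    """Apply the switching operation of Equation (3.5).

    pmf[a0] is the probability that the input bundle carries a0 messages;
    each message is routed to the output bundle independently with
    probability q.
    """
    out = [0.0] * len(pmf)
    for a0, pa in enumerate(pmf):
        for b0 in range(a0 + 1):
            out[b0] += comb(a0, b0) * q**b0 * (1 - q) ** (a0 - b0) * pa
    return out


# Eight channels, each loaded with probability 1/2, switched with q = 1/2.
eight = [comb(8, n) / 256 for n in range(9)]
switched = switch(eight, 0.5)
mean_before = sum(n * p for n, p in enumerate(eight))
mean_after = sum(n * p for n, p in enumerate(switched))
```

In this example the mean load falls from 4 to 2, since each message survives the switching step with probability 1/2.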

In the z-domain, we take an analogous approach. Note that the number
of messages routed to the output channel is the sum of a random number
of identically distributed random variables. The number of summands is
the number of messages on the input load. The summands themselves are
Bernoulli random variables that are 1 when the message in question is routed
to the output bundle and 0 when it is not.

If we use the random variable c to denote one of the summands, its
probability mass function is given by

$$p_c(c_0) = (1 - q) \, \delta(c_0) + q \, \delta(c_0 - 1)$$

with z-transform

$$p_c^T(z) = 1 - q + qz$$

Thus we have

$$\mathcal{Z}[\mathcal{S}[p_a(a_0), q]] = p_a^T(p_c^T(z)) = p_a^T(1 - q + qz) \qquad (3.6)$$

Of course, if we cascade K switching operations whose probabilities are
$q_1, q_2, \ldots, q_K$, the effect on the probability that an individual message is
routed to the output bundle is the same as if we performed one switching
operation with $q = \prod_{i=1}^{K} q_i$.
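The z-domain form of switching, Equation (3.6), amounts to substituting the polynomial 1 - q + qz into the transform of the input load. The Python sketch below performs that substitution with Horner's rule on coefficient lists and checks numerically that two cascaded switching steps behave like one step with the product of the probabilities. The helper names `poly_mul` and `switch_z` are assumptions of this sketch.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def switch_z(pmf, q):
    """Switching in the z-domain: compose the transform with 1 - q + q z.

    pmf doubles as the coefficient list of its own transform, because the
    transform used here is the sum of pmf[n] * z**n.
    """
    inner = [1.0 - q, q]
    result = [pmf[-1]]
    for coeff in reversed(pmf[:-1]):  # Horner's rule
        result = poly_mul(result, inner)
        result[0] += coeff
    return result


pmf = [0.1, 0.2, 0.3, 0.4]
once = switch_z(pmf, 0.25)
twice = switch_z(switch_z(pmf, 0.5), 0.5)
```

The coefficient lists `once` and `twice` agree to rounding error, which is the cascade property stated above for q_1 = q_2 = 1/2.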

The (predictable) effect of switching upon a loading probability mass
function is to decrease the mean. The effect on the distribution of Figure 3.2
of switching with q = 1/2 is shown in Figure 3.4.

Figure 3.4: The effect of switching the loading probability mass function of
Figure 3.2 with probability 1/2.

3.6 Deriving Switching Probabilities from Message
Destination Distributions

The technique we use for deriving switching probabilities from message
destination distributions has also been used by Lin and Kleinrock in [17].

As described in Section 2.1, the addresses of distinct messages injected
into the network are independent and identically distributed. Suppose that
the message sinks are numbered 0, 1, ..., N - 1, and consider a switch X
for which the set of accessible message sinks is S. Suppose that X has M
output ports. By uniqueness of paths in a Banyan network, the ports must
have disjoint sets $S_1, S_2, \ldots, S_M$ of accessible destinations, and because the
destinations accessible through the output ports are all the destinations, we
must have that $\bigcup_{i=1}^{M} S_i = S$. An example is depicted in Figure 3.5.

We wish to know the probability that an arbitrary message arriving
at switch X is directed in direction i. Suppose that some message W with
destination given by the random variable D is injected into the network. The
value we are looking for is the conditional probability that W is addressed
to a destination in the set $S_i$, given that it has arrived at switch X. We
have

$$P\{D \in S_i \mid D \in S\} = \frac{P\{(D \in S_i) \cap (D \in S)\}}{P\{D \in S\}} = \frac{P\{D \in S_i\}}{P\{D \in S\}} = \frac{\sum_{d \in S_i} P\{D = d\}}{\sum_{d \in S} P\{D = d\}} \qquad (3.7)$$

where the last expression follows from mutual exclusivity of destinations.

Figure 3.5: The destination set of the switch X is {1, 3, 5, 7}. The destination
set of the upper channel is {1, 5}; that of the lower channel is {3, 7}.
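Equation (3.7) is a ratio of sums over destination sets, which the following Python sketch computes for the destination sets given in the caption of Figure 3.5. The function name `switching_probabilities` and the non-uniform distribution `skewed` are hypothetical, introduced only for illustration.

```python
def switching_probabilities(dest_pmf, port_sets):
    """Return, for each output port, the probability of Equation (3.7).

    dest_pmf[d] is the probability that a message is addressed to sink d;
    port_sets is a list of disjoint sets of sinks, one per output port.
    Their union is the set S of sinks reachable from the switch.
    """
    total = sum(dest_pmf[d] for ports in port_sets for d in ports)
    return [sum(dest_pmf[d] for d in ports) / total for ports in port_sets]


# Destination sets from the caption of Figure 3.5.
port_sets = [{1, 5}, {3, 7}]

# Uniform destination distribution over eight sinks.
uniform = [1 / 8] * 8
uniform_probs = switching_probabilities(uniform, port_sets)

# A made-up skewed distribution in which sinks 1 and 5 are favored.
skewed = [0.05, 0.3, 0.05, 0.1, 0.05, 0.3, 0.05, 0.1]
skewed_probs = switching_probabilities(skewed, port_sets)
```

With the uniform distribution both directions get probability 1/2; with the skewed one the upper channel gets 0.6/0.8 = 3/4.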


3.7 Example: the $2^k \times 2^k$ Crossbar

As an example of both the symbolic and numeric use of our method, we
derive a well-known expression for the throughput of the $2^k \times 2^k$ crossbar.

We form a schematic representation of the crossbar with a combination
of our operators. First we construct a bundle of $2^k$ channels by bundling the
single-channel inputs k times. Then we switch the messages on the bundle
k times, to form $2^k$ bundles, each of which can hold $2^k$ messages. Finally we
concentrate these $2^k$-wide bundles to single channels, thereby modeling the
blocking that takes place in the crossbar. Figure 3.6 shows the result for an
8 x 8 crossbar.

For brevity's sake, in our analysis we assume that all input channels have
a single probability Q of transmitting, and that the destination address
distribution is uniform. It will be evident that the derivation would otherwise
proceed in the same fashion, but would be more lengthy.

Suppose that the input channels have probability Q of transmitting during
a cycle. If we call the load on an input channel y, the loading probability
mass function for an input channel will then be

$$p_y(y_0) = Q \, \delta(y_0 - 1) + (1 - Q) \, \delta(y_0)$$

with z-transform

$$p_y^T(z) = Qz + (1 - Q)$$

Bundling k times, we get for the transform of the probability mass function
of the load $x_b$ on the bundle entering the switches

$$p_{x_b}^T(z) = \left[ Qz + (1 - Q) \right]^{2^k}$$

Let $x_s$ be the load on a channel after the stages of switching, but before
concentration. We switch k times with probability 1/2 at each stage, the
result being the same as switching once with probability $1/2^k$:

$$p_{x_s}^T(z) = p_{x_b}^T\left(1 - \frac{1}{2^k} + \frac{z}{2^k}\right) = \left(1 - \frac{Q}{2^k} + \frac{Qz}{2^k}\right)^{2^k}$$

To make the expression clearer, we substitute $M = 2^k$, rearrange, and
invert the z-transform:

$$p_{x_s}(x_{s0}) = \left(\frac{Q}{M}\right)^{M} \sum_{l=0}^{M} \binom{M}{l} \left(\frac{M}{Q} - 1\right)^{l} \delta(x_{s0} - M + l)$$

We can save ourselves some work in performing the concentration from
$2^k$ (that is, M) channels to one channel by making use of the following
device. We note that, because we are concentrating to a single channel, the
only possible loads for the channel are 0 and 1. We recall from Section 3.4
that concentration will retain the probability for a load of 0, as 0 is less
than the maximum load on the channel. The probability for a load of 1 will
necessarily be the complement of that for 0. First we take the probability
that $x_s = 0$:

$$p_{x_s}(0) = \left(\frac{Q}{M}\right)^{M} \sum_{l=0}^{M} \binom{M}{l} \left(\frac{M}{Q} - 1\right)^{l} \delta(l - M)$$

We simplify the expression by noting that the terms where $l \ne M$ will all
be 0:

$$p_{x_s}(0) = \left(\frac{Q}{M}\right)^{M} \left(\frac{M}{Q} - 1\right)^{M} = \left(1 - \frac{Q}{M}\right)^{M}$$


If we call the load on an output channel l, the loading probability mass
function for an output channel is then given by

$$p_l(l_0) = \left(1 - \frac{Q}{M}\right)^{M} \delta(l_0) + \left[ 1 - \left(1 - \frac{Q}{M}\right)^{M} \right] \delta(l_0 - 1)$$

The expected load on an output channel is then

$$E[l] = 1 - \left(1 - \frac{Q}{M}\right)^{M}$$

There are M output channels, so the expected load on all of them, or the
throughput of the crossbar, is

$$E[Ml] = M \left[ 1 - \left(1 - \frac{Q}{M}\right)^{M} \right]$$

The expected load on an input channel was Q, so that the total expected
input load is MQ. We can now use Equation (2.1) to derive the probability
of successful message transmission in an M x M crossbar:

$$P_S = \frac{1 - \left(1 - \frac{Q}{M}\right)^{M}}{Q}$$

We plot the probability of successful message transmission against the source
transmission probability in Figure 3.7.

Figure 3.7: The probability of successful message transmission (labeled
P{Success}) as a function of the probability that a source is transmitting
(labeled Qi), in an eight-by-eight crossbar network with a uniform
destination distribution.
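As a numerical cross-check of this derivation, the Python sketch below builds the crossbar model from the three elementary operations (bundle k times, switch k times with probability 1/2, concentrate to one channel) and compares the resulting probability of successful transmission with the closed form (1 - (1 - Q/M)^M)/Q, which is the throughput M(1 - (1 - Q/M)^M) divided by the offered load MQ. All function names are assumptions of this sketch; the operators are re-implemented here and are not the Appendix A package.

```python
from math import comb


def bundle(a, b):
    """Convolve two loading PMFs (lists indexed by load)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def switch(pmf, q):
    """Route each message independently with probability q (binomial thinning)."""
    out = [0.0] * len(pmf)
    for a0, pa in enumerate(pmf):
        for b0 in range(a0 + 1):
            out[b0] += comb(a0, b0) * q**b0 * (1 - q) ** (a0 - b0) * pa
    return out


def concentrate(pmf, k):
    """Move all probability for loads of k or more onto load k."""
    return list(pmf) if len(pmf) - 1 <= k else list(pmf[:k]) + [sum(pmf[k:])]


def crossbar_success_probability(k, Q):
    """P_S for a 2^k x 2^k crossbar, computed operator by operator."""
    m = 2**k
    pmf = [1.0 - Q, Q]
    for _ in range(k):
        pmf = bundle(pmf, pmf)
    for _ in range(k):
        pmf = switch(pmf, 0.5)
    out = concentrate(pmf, 1)
    throughput = m * out[1]  # expected load summed over the m output channels
    return throughput / (m * Q)


def closed_form(k, Q):
    m = 2**k
    return (1.0 - (1.0 - Q / m) ** m) / Q


numeric = crossbar_success_probability(3, 0.7)
symbolic = closed_form(3, 0.7)
```

For the 8 x 8 case with Q = 0.7 the two values agree to rounding error.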

3.8  Automatic  Calculation  of  Numerical  Values 
for  Performance  Parameters 

We present in Appendix A a package of Mathematica procedures that implement
the elementary operations we have described. Of course the operations
are  easily  implemented  in  other  languages,  but  it  is  advantageous  to  use  a 
symbolic  algebra  package  if  one  wishes  to  derive  symbolic  expressions  for 
performance. 

We  can  use  this  package  to  implement  procedures  that  operate  on  source 
loading  probability  mass  functions  and  return  the  loading  probability  mass 
functions  for  channels  leading  to  sinks.  As  an  example,  we  turn  again  to  the 
network  of  Figure  2.2.  The  tree  whose  root  is  one  of  the  sinks  and  whose 
leaves  are  the  sources  is  depicted  in  Figure  3.8. 

Figure 3.8: The tree of channels leading to a sink in the network of
Figure 2.2.

Assuming for clarity's sake that the destination address distribution is
uniform, we might use our package to model the 2 x 2 crossbar as follows:

crossbar2x2[PMF1_, PMF2_] :=
    concentrate[switch[bundle[PMF1, PMF2], 1/2], 1]

If we also assume (again, in the interests of brevity; it will be clear that
the calculation in the general case is no more complex) that all sources
transmit with equal probability, we can take advantage of the symmetry of
the network to calculate the loading probability mass function of a channel
leading to a sink as follows:

threeStageDelta[q_] :=
    Block[{inputPMF, stage1PMF, stage2PMF},
        inputPMF := {(1-q), q};
        stage1PMF := crossbar2x2[inputPMF, inputPMF];
        stage2PMF := crossbar2x2[stage1PMF, stage1PMF];
        crossbar2x2[stage2PMF, stage2PMF]
    ]

Here the probability that a source is transmitting is specified as the input
parameter; three levels of switching are performed; the result of the last is
returned.

We may calculate the resulting bandwidth and probability of successful
message transmission from Equation (2.1), as is done in Section 3.7. The
results are plotted in Figures 2.4 and 2.1, on page 11.
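For readers without Mathematica, the same three-stage calculation can be sketched in Python. Because channel loads entering a switch in a Banyan network are independent (Section 3.2) and every channel here carries at most one message, each stage reduces to applying the crossbar result of Section 3.7 with M = 2 to a single probability. This shortcut, and the names `crossbar_output_probability` and `three_stage_delta`, are assumptions of this sketch rather than a transcription of the package.

```python
def crossbar_output_probability(m_in, size=2):
    """Probability that an output of a size x size crossbar carries a message,
    when each input independently carries one with probability m_in and
    destinations are uniform (Section 3.7 with M = size)."""
    return 1.0 - (1.0 - m_in / size) ** size


def three_stage_delta(q):
    """Loading PMF [P(load = 0), P(load = 1)] of a channel leading to a sink
    after three stages of 2 x 2 crossbars, for source probability q."""
    m = q
    for _ in range(3):
        m = crossbar_output_probability(m)
    return [1.0 - m, m]


sink_pmf = three_stage_delta(1.0)
bandwidth = 8 * sink_pmf[1]        # eight sinks, by symmetry
p_success = bandwidth / (8 * 1.0)  # delivered load over offered load
```

With every source transmitting (q = 1) the per-stage probabilities are 0.75, 0.609375, and about 0.5165, so roughly 4.13 of the 8 offered messages are delivered per cycle under this model.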


3.9  Modeling  an  Unusual  Switching  Component 

We use an example to illustrate the modeling of an unusual switching
component: an 8 x 4, dilation 2 switch.⁴ Such switches are more usually used
in multipath networks, as we shall see in Chapter 4, but Banyan networks
with replicated links are not unknown, and Kruskal and Snir have analyzed
regular variants in [15].


3.9.1  An  Application  for  an  8  x  4,  dilation  2  Switch 

We can use standard 4 x 4 crossbars to build a 16 x 16 indirect binary
cube network, as depicted in Figure 3.9. The methods of analysis of the
performance of this network follow directly those of Sections 3.7 and 3.8.

As an alternative, we might choose to use a different sort of switching
element in the first stage, to improve performance. This switching element,
an 8 x 4, dilation 2 switch, has eight input channels, but switches messages
in only four logical directions, with two output ports for each of these
logical directions. If only one message is switched in a particular direction,
the output port is picked randomly. If two messages are switched in the
direction, both ports are used; if more than two messages are switched in
the direction, the excess messages are blocked.

In Figure 3.10, we show how we might modify the first stage of the 16 x 16
indirect cube network to make use of the dilated switching component. The
second stage must still use 4 x 4 crossbars, to select the particular output
channel to which the message is directed.

Although it might appear that we have constructed a multipath network
here, in fact we have not. The numbers appearing next to output ports on
the dilated components in Figure 3.10 are logical direction numbers, and it
is to be noted that both outputs for a particular logical direction lead to
the same second-stage switch; the model will reflect this. Thus we have four
two-channel bundles leading from each first-stage component.

⁴ See Section 2.1 for an explanation of this terminology.

Figure 3.9: A 16 x 16 indirect binary cube network built from standard 4 x 4
crossbars.

Figure 3.11: Schematic representation of an eight-by-four, dilation two
switching component. The switching probabilities are for a uniform
destination address distribution.

3.9.2 Deriving Expressions for the Performance of the 8 x 4,
Dilation 2 Switch

A schematic model of the 8 x 4, dilation 2 switch is shown in Figure 3.11.
Note that, in our model, the only difference between this component and
the 8 x 8 crossbar of Figure 3.6 is that there are only two stages of switching,
and the final concentration is to two channels, rather than to one. Here we
gain an intuition from our model: we noted that concentration was where
blocking occurred. Because there is less concentration in the new network,
there will be less blocking.

The derivation follows that of the $2^k \times 2^k$ crossbar in Section 3.7. Again,
for brevity's sake we assume a uniform destination distribution. Call the
load on an input channel y. Assuming all inputs transmit with probability
Q, the loading probability mass function for an input channel is

$$p_y(y_0) = Q \, \delta(y_0 - 1) + (1 - Q) \, \delta(y_0)$$

with z-transform

$$p_y^T(z) = Qz + (1 - Q)$$

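Call the load on the bundle entering the switches $x_b$. The transform of the
probability mass function of $x_b$ is then

$$p_{x_b}^T(z) = \left( p_y^T(z) \right)^{8} = \left[ Qz + (1 - Q) \right]^{8}$$

We switch twice with probability 1/2 each time, the result being the same
as switching once with probability 1/4. If $x_s$ is the load on a channel after
the two stages of switching, we have

$$p_{x_s}^T(z) = p_{x_b}^T\left(\frac{3}{4} + \frac{z}{4}\right) = \left(1 - \frac{Q}{4} + \frac{Qz}{4}\right)^{8}$$

Now we invert the transform:

$$p_{x_s}(x_{s0}) = \left(\frac{Q}{4}\right)^{8} \sum_{l=0}^{8} \binom{8}{l} \left(\frac{4}{Q} - 1\right)^{l} \delta(x_{s0} - 8 + l)$$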

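We concentrate to two channels here, so that it still saves us some work
to use the technique we did for the crossbar, but it will be a little more
complicated to do so.

We take the probability that $x_s = 0$ first. The sum will be zero whenever
$l \ne 8$, giving us:

$$p_{x_s}(0) = \left(\frac{Q}{4}\right)^{8} \left(\frac{4}{Q} - 1\right)^{8} = \left(1 - \frac{Q}{4}\right)^{8}$$

For $x_s = 1$, the sum will be zero whenever $l \ne 7$, giving us:

$$p_{x_s}(1) = 8 \left(\frac{Q}{4}\right)^{8} \left(\frac{4}{Q} - 1\right)^{7} = 2Q \left(1 - \frac{Q}{4}\right)^{7}$$

Call the load on a two-channel output bundle l. We know that $p_l(0) = p_{x_s}(0)$
and $p_l(1) = p_{x_s}(1)$. The only other case for a two-channel bundle is
l = 2, so the probability for l = 2 must be the complement of the other two
cases, so we have for the probability mass function of l:

$$p_l(l_0) = \left(1 - \frac{Q}{4}\right)^{8} \delta(l_0) + 2Q \left(1 - \frac{Q}{4}\right)^{7} \delta(l_0 - 1) + \left[ 1 - \left(1 - \frac{Q}{4}\right)^{8} - 2Q \left(1 - \frac{Q}{4}\right)^{7} \right] \delta(l_0 - 2)$$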


By our assumptions of uniformity, all four output bundles have the same
loading probability mass function, and so the throughput of the switch is
E[4l]:

$$E[4l] = 4 \left[ p_l(1) + 2 p_l(2) \right] = 8 \left[ 1 - \left(1 - \frac{Q}{4}\right)^{7} \left(1 + \frac{3Q}{4}\right) \right]$$

The expected load on an input channel was Q, so that the total expected
input load is 8Q. As for the crossbar, we use Equation (2.1) to derive the
probability of successful message transmission:

$$P_S = \frac{1 - \left(1 - \frac{Q}{4}\right)^{7} \left(1 + \frac{3Q}{4}\right)}{Q}$$

The  probability  of  successful  message  transmission  is  plotted  against  the 
source  transmission  probability  in  Figure  3.12. 
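The result for the 8 x 4, dilation 2 switch can be checked without any transform machinery. Under the stated assumptions (eight independent inputs, each transmitting with probability Q, uniform destinations), the number of messages aimed at one logical direction is binomial with parameters 8 and Q/4, and concentration to two channels caps that number at 2. The Python sketch below compares the probability of successful transmission obtained this way with the closed form (1 - (1 - Q/4)^7 (1 + 3Q/4))/Q; the function names are hypothetical.

```python
from math import comb


def dilated_8x4_bundle_pmf(Q):
    """PMF [P(0), P(1), P(2)] of the load on one two-channel output bundle."""
    p = Q / 4.0  # an input sends a message in this direction with probability Q/4
    binom = [comb(8, n) * p**n * (1 - p) ** (8 - n) for n in range(9)]
    return [binom[0], binom[1], sum(binom[2:])]  # loads of 2 or more are capped at 2


def success_probability(Q):
    """Expected delivered load over expected offered load for the whole switch."""
    pmf = dilated_8x4_bundle_pmf(Q)
    throughput = 4 * (pmf[1] + 2 * pmf[2])  # four output bundles
    return throughput / (8 * Q)             # eight inputs, each offering Q


def closed_form(Q):
    return (1 - (1 - Q / 4) ** 7 * (1 + 3 * Q / 4)) / Q


numeric = success_probability(0.6)
symbolic = closed_form(0.6)
```

The two values agree to rounding error for Q = 0.6, and both tend to 1 as Q tends to 0, as one would expect of a lightly loaded switch.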

3.9.3 Performance of the 8 x 4, Dilation 2 Switch

We can use our package of Mathematica procedures to write a procedure for
the 8 x 4, dilation 2 switch, as follows:

eightXfourD2[q_] :=
    Block[{bundled, switched, instages, outstages},
        instages = 3;   (* bundle three times: 2^3 = 8 input channels *)
        outstages = 2;  (* switch twice: 2^2 = 4 logical directions *)
        bundled = {(1-q), q};
        Do[bundled = bundle[bundled, bundled], {instages}];
        switched = bundled;
        Do[switched = switch[switched, .5], {outstages}];
        concentrate[switched, 2]]

Figure 3.12: Probability of successful message transmission plotted against
source transmission probability for an 8 x 4, dilation 2 switch, under a
uniform destination address distribution.

We will need a four-by-eight, input dilation two crossbar for the second
stage:

crossbar2x4in2[stageTwoPMF_] :=
    Block[{bundled, switched},
        bundled = stageTwoPMF;
        bundled = bundle[bundled, bundled];
        switched = bundled;
        Do[switched = switch[switched, .5], {2}];
        concentrate[switched, 1]]




Figure 3.13: The bandwidth of the 16 x 16 indirect cube made from 4 x 4
crossbars, as depicted in Figure 3.9, is shown dashed. The bandwidth of the
variant with a first stage made from 8 x 4, dilation 2 switches, as depicted
in Figure 3.10, is shown in solid black. Both are plotted against the source
transmission probability, for a uniform destination address distribution.

Now  we  can  specify  a  procedure  that  yields  as  output  the  probability 
mass  function  of  the  load  on  a  channel  leading  to  a  sink,  given  the  probability 
that  a  source  is  transmitting: 

eightXfourD2indirect16[q_] :=
    Block[{firstStageOut},
        (* input of first stage is just q *)
        (* returns LPMF for 2-wide channel *)
        firstStageOut = eightXfourD2[q];
        (* now feed to 4x4 crossbars and return result *)
        crossbar2x4in2[firstStageOut]]

We  plot  in  Figure  3.13  the  bandwidth  for  the  16  x  16  indirect  cube  made 
from  4x4  crossbars,  and  that  for  the  variant  with  a  first  stage  made  from 
8x4,  dilation  2  switches.  It  will  be  seen  that,  as  predicted,  the  performance 
of  the  network  built  with  the  dilated  part  is  better. 
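The comparison can be reproduced numerically with a short Python sketch that mirrors the structure of the Mathematica procedures above: two stages of 4 x 4 crossbars for the standard network, and an 8 x 4, dilation 2 first stage followed by a crossbar fed by two 2-wide bundles for the variant. The operators are re-implemented here under the same assumptions (uniform destinations, equal source probabilities); the function names and the choice of 16 times the sink-channel load as the bandwidth are assumptions of this sketch.

```python
from math import comb


def bundle(a, b):
    """Convolve two loading PMFs (lists indexed by load)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def switch(pmf, q):
    """Route each message independently with probability q."""
    out = [0.0] * len(pmf)
    for a0, pa in enumerate(pmf):
        for b0 in range(a0 + 1):
            out[b0] += comb(a0, b0) * q**b0 * (1 - q) ** (a0 - b0) * pa
    return out


def concentrate(pmf, k):
    """Move all probability for loads of k or more onto load k."""
    return list(pmf) if len(pmf) - 1 <= k else list(pmf[:k]) + [sum(pmf[k:])]


def standard_16x16_bandwidth(q):
    pmf = [1.0 - q, q]
    for _ in range(2):  # two stages of 4 x 4 crossbars
        four = bundle(bundle(pmf, pmf), bundle(pmf, pmf))
        pmf = concentrate(switch(four, 0.25), 1)
    return 16 * pmf[1]


def dilated_16x16_bandwidth(q):
    eight = [1.0 - q, q]
    for _ in range(3):  # 8 input channels of a first-stage switch
        eight = bundle(eight, eight)
    two_wide = concentrate(switch(eight, 0.25), 2)
    # A second-stage switch receives two independent 2-wide bundles.
    sink = concentrate(switch(bundle(two_wide, two_wide), 0.25), 1)
    return 16 * sink[1]


standard_full = standard_16x16_bandwidth(1.0)
dilated_full = dilated_16x16_bandwidth(1.0)
```

At full load this model gives a bandwidth of about 8.4 for the standard network and about 9.1 for the dilated variant, consistent with the ordering described for Figure 3.13.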


Chapter  4 

Analyzing  the  Performance 
of  Multipath  Networks 

4.1  Introduction 

In the previous chapter we have presented a method of analysis of Banyan
network performance. But as we discussed in the introduction, Banyan
networks, while amenable to analysis, are not intrinsically fault-tolerant.

We present in this chapter a method of analysis of multipath networks.
The performance parameters, and the model, are much the same as for
Banyan networks; but the requirement of unique paths and thus independence
of channel loads is removed.

We leave behind the scheme of using elementary operations to build
descriptions of switching elements, and instead directly derive the joint loading
probability mass function for a set of channels leading from a switch.

We also present a program that solves these equations exactly. As was
mentioned in the introduction, the program cannot be used for large networks,
as its running time grows too quickly. We have found in the literature
no polynomial-time program that computes the exact blocking probability
of a multipath network. It may be that the problem is intractable, although
we know of no proof of NP- or #P-completeness for it.

Chapter 5 describes an approximation method for estimating solutions
to the equations, by making use of the exact solution for subproblems.

Figure 4.1: An 8 x 8 deterministically-interwired network with redundant
paths. There are a number of different paths from any source to any sink,
to increase fault-tolerance; redundant paths from message source 4 to sink
4 are highlighted. Routing is oblivious, with stochastic concentration. This
wiring scheme is from [2].

4.2  Extensions  to  the  Model 

Figure  4.1  depicts  a  multipath  network.  We  extend  our  model  so  that  sources 
can  have  more  than  one  channel  to  the  network.  A  source  still  generates  at 
most  one  message  per  cycle,  which  is  directed  to  a  stage  1  switch  via  one  of 
the  channels  connecting  the  source  to  the  network.  The  particular  channel 
is  selected  randomly  and  with  uniform  probability. 

As before, the processes generating messages at the sources are independent
and memoryless. With some specified probability $p_i$, each source
i generates a single message at the beginning of each cycle; otherwise it
generates none. The network is synchronous: at each cycle messages move
from stage i to stage i + 1. It is also unbuffered: if a message is blocked at
some stage, it is considered to be lost, and does not in any way affect the
future states of the system.

We use dilated switches, as described in Section 2.1, so that the set
of output channels of a switching element is divided into nonempty disjoint
subsets called logical directions. At each cycle, the switching element directs
each incoming message in one logical direction. As for Banyan networks,
we can choose the switching probabilities to model any single destination
address distribution. When we route messages in a logical direction, we use
stochastic concentration:

•  If  there  are  fewer  messages  or  exactly  the  same  number  of  messages 
directed  in  the  logical  direction  as  there  are  channels  in  that  logical 
direction,  then  the  channels  that  will  carry  the  messages  are  chosen 
randomly,  with  uniform  probability. 

•  If  there  are  more  messages  directed  in  a  logical  direction  than  there 
are  channels  in  that  direction,  the  messages  that  can  be  carried  are 
chosen  with  uniform  probability,  and  the  other  messages  are  blocked 
and  lost. 
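The two rules above can be sketched as a small randomized routine in Python. The function name `stochastic_concentration`, the use of `None` for an idle channel, and the seeded generator are assumptions of this sketch, chosen only to make the rule concrete.

```python
import random


def stochastic_concentration(messages, num_channels, rng):
    """Assign messages to the channels of one logical direction.

    Returns a list of length num_channels whose entries are the message
    carried on each channel, or None for an idle channel.
    """
    if len(messages) > num_channels:
        # Too many messages: pick the survivors uniformly; the rest are lost.
        carried = rng.sample(messages, num_channels)
    else:
        carried = list(messages)
    # Pick the channels that carry the surviving messages uniformly at random.
    channels = rng.sample(range(num_channels), len(carried))
    out = [None] * num_channels
    for channel, message in zip(channels, carried):
        out[channel] = message
    return out


rng = random.Random(0)
overloaded = stochastic_concentration(["a", "b", "c"], 2, rng)  # one message is blocked
underloaded = stochastic_concentration(["a"], 2, rng)           # one channel stays idle
```

In the overloaded case exactly two of the three messages appear on the output channels; in the underloaded case the single message lands on one of the two channels with equal probability.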

We  note  again  that  our  network  is  self-routing:  each  message  contains 
the  information  necessary  to  route  the  message  from  the  source  where  it  is 
injected  to  the  sink  that  is  its  destination.  No  global  information  is  used.  In 
particular,  this  means  that  if  we  have  several  switches  at  a  single  stage,  then 
given  the  loads  on  their  input  channels,  the  loads  on  the  output  channels  of 
each  switch  are  independent  of  the  loads  on  the  output  channels  of  the  other 
switches. This fact will be important in allowing us to factor joint loading
probability mass functions.

Having extended our model, let us return to the network of Figure 4.1.
The switches here are 4 x 2, dilation 2 switches, except at the last stage,
where they are simply 2 x 2 (dilation 1) switches. In the 4 x 2, dilation 2
switches, the top two output channels constitute one logical direction, and
the bottom two constitute another.

As with Banyan networks, we wish to find the bandwidth and the probability
of successful message transmission of the networks we model. We find
these parameters by finding the probability mass functions of the loads on
channels leading to sinks.

4.3  The  Joint  Probability  Mass  Function  of  an 
Aggregate  of  Channels 

Suppose that the input channels of a switch S, depicted in Figure 4.2, are
connected to several switches $R_1, R_2, \ldots, R_i$. Let us use the random variable
L to denote the entire output loading configuration of S at some specified
discrete time t, so that P{L = l} is the probability that the output channels
of the switch have some particular loads designated in their aggregate by l
during cycle t.

Figure 4.2: Interstage wiring. Note that no subset of the channels depicted
need have mutually independent loads in a network with redundant paths.
The output channels on the right of the switch marked S are those whose
loads are referred to collectively in the text with the random variable L.

Now consider the loads on the input channels $C_{11}, \ldots, C_{iw}$ at cycle t - 1.
(Because we assume a synchronous, unbuffered network with memoryless
processes generating the messages at the inputs, only the cycle before cycle
t is of interest.) Let us denote the loads on the input channels at cycle t - 1
with the random variables $L_{C_{11}}, \ldots, L_{C_{iw}}$.

In order to find the joint probability mass function of the loads on the
output channels of S, we condition on the loads on the input channels:

$$P\{L = l\} = \sum_{l_{C_{11}}, \ldots, l_{C_{iw}}} P\{L = l \mid L_{C_{11}} = l_{C_{11}}, \ldots, L_{C_{iw}} = l_{C_{iw}}\} \cdot P\{L_{C_{11}} = l_{C_{11}}, \ldots, L_{C_{iw}} = l_{C_{iw}}\} \qquad (4.1)$$

where the sum is over all tuples $l_{C_{11}}, \ldots, l_{C_{iw}}$ with elements in {0, 1}.

Suppose that we can compute
$P\{L = l \mid L_{C_{11}} = l_{C_{11}}, \ldots, L_{C_{iw}} = l_{C_{iw}}\}$.¹
In order to compute the probability of an output loading configuration of S
we will still need to find the joint probability mass function of the channel
loads $L_{C_{11}}, \ldots, L_{C_{iw}}$. In a Banyan network, it would be easy to compute this
function: it would simply be the product of the probability mass functions of
the loads on the individual channels, as channel loads in a Banyan network
are independent.² In a network with redundant paths, however, the loads
on these channels are not in general independent, as they may derive from
the same sources, and a message from a single source that has traveled one
path in the network cannot be traveling along another path. Thus another
method must be used.

¹ An expression for this conditional probability is derived in Section 4.4.

Figure 4.3: Channels referred to in Equation (4.2). Although the probabilities
of the message loads on the channels $C_{11}, \ldots, C_{iw}$ are not in general
independent, the loads on the subset of channels from each switching element
are independent given the message loads on the input channels $B_{11}, \ldots, B_{it}$.

In  Figure  4.3.  we  see  that  the  input  channels  C'n . CIW  of  switch  S 

are  the  output  channels  of  switches  R\. - R,.  Let  us  call  the  loads  on 

the  input  channels  to  these  switches  . Lg,,.  We  may  now  calcu¬ 
late  P{Z<ru  =  k'n' _ Lc,a.  =  lc,u,}  by  conditioning  on  the  values  of  the 

variables  Zgn . Zgi(.  We  have 

HLcn  =  lctl . Lc,n.  =  lc,J  = 

X  pl^'n  =  lc„ . Lc,u  =  k\u  I  Lbu  =  lBu . Lb„  =  Ib„)  ' 

^11 . lB„ 

P{Zgn  =  /fin . LB,t  =  Ib„}  (  1-2) 

2See  Section  3.2. 
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where  t he  sum  is  over  all  tuples  /yn . Ib,,  with  elements  in  {0.  1}. 

The loads on the output channels of these switches are not in general
mutually independent. However, let us partition them into subsets according
to the switch at which they originate, so that for the channels shown in
Figure 4.3 we would have the subsets

$$
\{C_{11}, \ldots, C_{1w}\}, \{C_{21}, \ldots, C_{2w}\}, \ldots, \{C_{t1}, \ldots, C_{tw}\}
$$

Note that, under the assumption of independence of message destinations,
and given the loads on the channels $B_{11}, \ldots, B_{tv}$, the loads on the switch
output channel subsets are mutually independent. As mentioned in Section 4.2,
this is a consequence of the fact that the networks we model are
self-routing. No global information is used in routing messages through the
network.

That is, if we know the input loads for the switches $R_1, \ldots, R_t$, then the
loading probabilities for the output channels of each of the switches do not
depend on the output loads of any other switch. We may use this fact to
derive the joint probability mass function of the loads on the output channels
$C_{11}, \ldots, C_{tw}$ by conditioning on the input channel loads. We have then

$$
P\{L_{C_{11}} = l_{C_{11}}, \ldots, L_{C_{tw}} = l_{C_{tw}}\} =
\sum_{l_{B_{11}}, \ldots, l_{B_{tv}}}
P\{L_{C_{11}} = l_{C_{11}}, \ldots, L_{C_{1w}} = l_{C_{1w}} \mid L_{B_{11}} = l_{B_{11}}, \ldots, L_{B_{1v}} = l_{B_{1v}}\} \cdot
P\{L_{C_{21}} = l_{C_{21}}, \ldots, L_{C_{2w}} = l_{C_{2w}} \mid L_{B_{21}} = l_{B_{21}}, \ldots, L_{B_{2v}} = l_{B_{2v}}\} \cdots
P\{L_{C_{t1}} = l_{C_{t1}}, \ldots, L_{C_{tw}} = l_{C_{tw}} \mid L_{B_{t1}} = l_{B_{t1}}, \ldots, L_{B_{tv}} = l_{B_{tv}}\} \cdot
P\{L_{B_{11}} = l_{B_{11}}, \ldots, L_{B_{tv}} = l_{B_{tv}}\}
\tag{4.3}
$$

where the sum is once again over all tuples $l_{B_{11}}, \ldots, l_{B_{tv}}$ with elements in
{0, 1}.

The subexpression $P\{L_{B_{11}} = l_{B_{11}}, \ldots, L_{B_{tv}} = l_{B_{tv}}\}$ can be evaluated
recursively by means of Equation (4.3), until the channels $B_{11}, \ldots, B_{tv}$
correspond to sources. If these channels originate at message sources, then we
substitute instead the probability mass functions corresponding to sources.
We may simply take the product of these functions for the sources in question,
as in our model the processes generating messages at the sources are
mutually independent.

If source i, depicted in Figure 4.4, generates a message with probability
$p_i$ and has k channels into the network, then we have for the loads on the
channels $C_1, \ldots, C_k$ the joint probability mass function

$$
P\{L_{C_1} = l_{C_1}, \ldots, L_{C_k} = l_{C_k}\} =
\begin{cases}
1 - p_i & \text{if all the } l_{C_j} \text{ are 0} \\
p_i / k & \text{if exactly one } l_{C_j} \text{ is 1 and the rest are 0} \\
0 & \text{otherwise}
\end{cases}
$$

Figure 4.4: The source i generates a single message at each cycle with
probability $p_i$. The message is transmitted with uniform probability over a
randomly picked channel in the set $\{C_1, \ldots, C_k\}$.
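
The source distribution can be written down directly. The following is a minimal Python sketch, not part of the Common LISP program described later; the function name `source_channel_pmf` and the use of exact fractions are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def source_channel_pmf(loads, p):
    """Joint pmf of the loads on the k channels leaving one source.

    loads: tuple of 0/1 values, one per channel.
    p: probability that the source generates a message in a cycle.
    """
    k = len(loads)
    ones = sum(loads)
    if ones == 0:
        return 1 - p      # the source generated no message
    if ones == 1:
        return p / k      # one message, sent on a uniformly chosen channel
    return 0 * p          # a source emits at most one message per cycle


# Hypothetical source with two channels and p = 1/2.
p = Fraction(1, 2)
total = sum(source_channel_pmf(l, p) for l in product((0, 1), repeat=2))
```

With two channels and p = 1/2, each single-message configuration has probability 1/4, and the four configurations sum to 1.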


It  remains  now  to  evaluate  the  conditional  probabilities  in  Equation  (4.4). 
Recall  that  these  are  the  joint  conditional  probability  that  some  subset  of 
the  output  channels  of  a  dilated  switch  have  a  particular  load,  given  that 
the input channels have a particular load. We derive an equation for these
conditional  probabilities  in  the  next  section. 


4.4  Joint  Probability  Mass  Functions  of  Dilated 
Switch  Output  Channels 

Suppose we have an M x N, dilation K switch. We may form the conditional
probability mass function of the loads on the output channels, given the
input load, by conditioning. Say that the random variable $L_{i,j}$ represents
the load on the jth channel in the ith logical direction.

We wish to evaluate the expression

$$
P\{L_{1,1} = l_{1,1}, \ldots, L_{N,K} = l_{N,K} \mid L_{C_1} = l_{C_1}, \ldots, L_{C_M} = l_{C_M}\}
$$

For an event E, define

$$
Q\{E\} = P\{E \mid L_{C_1} = l_{C_1}, \ldots, L_{C_M} = l_{C_M}\}
\tag{4.5}
$$

Of course $Q\{E\}$, like $P\{E \mid L_{C_1} = l_{C_1}, \ldots, L_{C_M} = l_{C_M}\}$, is a probability in
the usual sense; the definition is used to make completely clear the
significance of the further conditioning we perform below. We will condition on
the number of messages directed in each logical direction. If the random
variable $D_i$ represents the number of messages routed in logical direction i,
we have:

$$
Q\{L_{1,1} = l_{1,1}, \ldots, L_{N,K} = l_{N,K}\} =
\sum_{d_1, \ldots, d_N}
Q\{L_{1,1} = l_{1,1}, \ldots, L_{N,K} = l_{N,K} \mid D_1 = d_1, \ldots, D_N = d_N\} \cdot
Q\{D_1 = d_1, \ldots, D_N = d_N\}
\tag{4.6}
$$

where the sum is over all N-tuples $d_1, \ldots, d_N$ such that each $d_i \ge 0$ and
$\sum_{i=1}^{N} d_i = \sum_{j=1}^{M} l_{C_j}$.

Now we consider the switching probability. We calculate the probabilities
for the N logical directions using Equation (3.7) of Section 3.6 (of
course, under uniform addressing each of these probabilities would be 1/N).
Suppose that these probabilities are $q_1, q_2, \ldots, q_N$. By our assumption of
independence of message addresses, the probability that of the $\sum_{j=1}^{M} l_{C_j}$
arriving messages, $d_1$ are directed in direction 1, $d_2$ in direction 2, and so on,
is simply multinomial, so that

$$
Q\{D_1 = d_1, \ldots, D_N = d_N\} =
\frac{\left(\sum_{j=1}^{M} l_{C_j}\right)!}{d_1! \, d_2! \cdots d_N!} \; q_1^{d_1} q_2^{d_2} \cdots q_N^{d_N}
\tag{4.7}
$$

Now let us evaluate $Q\{L_{1,1} = l_{1,1}, \ldots, L_{N,K} = l_{N,K} \mid D_1 = d_1, \ldots, D_N = d_N\}$.
Say that $b_i$ is the number of messages output in direction i; that is,
$b_i = \sum_{j=1}^{K} l_{i,j}$. This number is not the same as $d_i$, because if there are more than
K messages to be output in a K-wide direction, some messages are dropped
and lost. If $b_i$ messages are output, then under stochastic concentration the
channels are picked with uniform probability, and so the probability of any
single configuration will be $1 / \binom{K}{b_i}$. Thus

$$
Q\{L_{1,1} = l_{1,1}, \ldots, L_{N,K} = l_{N,K} \mid D_1 = d_1, \ldots, D_N = d_N\} =
\begin{cases}
0 & \text{if for any } i, \; b_i \ne \min(d_i, K) \\
\prod_{i=1}^{N} \dfrac{1}{\binom{K}{b_i}} & \text{otherwise}
\end{cases}
\tag{4.8}
$$

where $b_i = \sum_{j=1}^{K} l_{i,j}$.
Combining Equations (4.5), (4.6), (4.7), and (4.8), we have

$$
P\{L_{1,1} = l_{1,1}, \ldots, L_{N,K} = l_{N,K} \mid L_{C_1} = l_{C_1}, \ldots, L_{C_M} = l_{C_M}\} =
\sum_{d_1, \ldots, d_N}
\left[ \prod_{i=1}^{N} \frac{1}{\binom{K}{b_i}} \right]
\frac{\left(\sum_{j=1}^{M} l_{C_j}\right)!}{d_1! \, d_2! \cdots d_N!} \; q_1^{d_1} q_2^{d_2} \cdots q_N^{d_N}
$$

where $b_i = \sum_{j=1}^{K} l_{i,j}$, and the sum is over the N-tuples $d_1, \ldots, d_N$ such that
for each $d_i$, $\min(d_i, K) = b_i$, and $\sum_{i=1}^{N} d_i = \sum_{j=1}^{M} l_{C_j}$.

Of course, if the conditional joint probability of the load on a subset
of the switch's output channels is desired, as opposed to all of the switch's
channels, we can simply sum this expression over all the possible loads on
the complement of the subset of channels whose loads are required, as the
different configurations of the output channels are mutually exclusive events.
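
The combination of Equations (4.6) through (4.8) can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's Common LISP code; the function name `switch_output_pmf` and its argument layout are assumptions. It takes the number m of arriving messages (the sum of the input loads) rather than the individual input loads, since only that sum enters the formula.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def switch_output_pmf(out_loads, m, q):
    """P{output loads = out_loads | m messages arrive} for a dilated switch.

    out_loads: list of N tuples, each holding the K 0/1 channel loads of
               one logical direction.
    m: number of messages arriving at the switch in this cycle.
    q: list of the N switching probabilities (they should sum to 1).
    """
    k = len(out_loads[0])
    b = [sum(direction) for direction in out_loads]
    # min(d_i, K) == b_i forces d_i == b_i unless the direction is full.
    choices = [range(b_i, m + 1) if b_i == k else (b_i,) for b_i in b]
    total = Fraction(0)
    for d in product(*choices):
        if sum(d) != m:
            continue
        ways = factorial(m)          # multinomial coefficient, Equation (4.7)
        prob = Fraction(1)
        for d_i, q_i in zip(d, q):
            ways //= factorial(d_i)
            prob *= Fraction(q_i) ** d_i
        total += ways * prob
    for b_i in b:                    # uniform channel choice, Equation (4.8)
        total /= comb(k, b_i)
    return total
```

For a 2 x 2 switch of dilation 1 with two arriving messages and uniform addressing, the call `switch_output_pmf([(1,), (1,)], 2, [Fraction(1, 2)] * 2)` gives 1/2: the two messages go in different directions half of the time.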


4.5 Automatic Calculation of Blocking Probabilities

It will be clear that the automatic calculation of blocking probabilities by
this means will require a great deal of time. Suppose we have a computer
program that calculates the blocking probabilities for a network in the most
obvious way, by finding the joint probability mass function of the channels
leaving the final stage, using Equation (4.3) recursively. In the worst case, we
can imagine a network where there are N stages and M dependent channels
between each of the N stages, and the joint probability mass function of all
of the channels between each of the stages must be formed. The domain
of the joint probability mass function for each stage then is of size $2^M$,
each value being calculated as a sum over $2^M$ terms. Assuming the time
to calculate each of the terms summed over in Equation (4.3) is O(M), we
have then $O(N M 2^{2M})$ for the worst-case performance.

The performance on some networks can be better than this, however.
Suppose that we need to calculate $P\{L_{C_1} = l_{C_1}, \ldots, L_{C_n} = l_{C_n}\}$. Let S(c)
denote the set of source nodes from which messages can reach channel c.
If we can partition the set of channels $\{C_1, \ldots, C_n\}$ into disjoint subsets
$S_1, \ldots, S_m$ such that for any $C_1 \in S_i$ and $C_2 \in S_j$, $i \ne j$, $S(C_1) \cap S(C_2)$ is
empty, then the loads on the channels in each subset $S_i$ are independent of
the loads on the channels in any and all of the other subsets in the partition.3

3 As can be seen from the argument in Section 5.2.


Figure 4.5: The network of Figure 4.1, with switches labeled.

Then the expression $P\{L_{C_1} = l_{C_1}, \ldots, L_{C_n} = l_{C_n}\}$ can be factored into the
product of m joint probability mass functions, one for each subset $S_i$. In the
limiting case of a Banyan network, a complete factoring will be possible for
every set of channels, and the summation itself can be factored, so that the
worst case performance for a Banyan network of N stages with M channels
between the stages becomes O(NM).
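
One way to find such a partition is to group together channels that share a source, directly or through a chain of other channels. The following Python sketch assumes the source sets S(c) are already known and are passed in as a dictionary; the function name `partition_by_sources` and the channel and source names in the usage note are hypothetical, and this is not a description of how the Common LISP program pre-computes independence.

```python
def partition_by_sources(sources_of):
    """Group channels so that different groups share no source node.

    sources_of: dict mapping a channel name to the set of source nodes
    from which messages can reach it (the set S(c) of the text).
    Returns a list of sets of channel names.
    """
    groups = []  # list of (channels, union of their source sets)
    for channel, sources in sources_of.items():
        merged_channels, merged_sources = {channel}, set(sources)
        remaining = []
        for chans, srcs in groups:
            if srcs & merged_sources:
                # Shares a source with the new channel: must be merged.
                merged_channels |= chans
                merged_sources |= srcs
            else:
                remaining.append((chans, srcs))
        groups = remaining + [(merged_channels, merged_sources)]
    return [chans for chans, _ in groups]
```

For example, with `{"c1": {"i0", "i1"}, "c2": {"i1", "i2"}, "c3": {"i4"}}` the channels c1 and c2 share source i1 and end up in one group, while c3 forms a group of its own, so the joint probability mass function factors into two parts.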

A Common LISP program has been written to evaluate the joint probability
mass function of the loads on specified channels in a multistage
interconnection network. The program is given a symbolic description of the
interconnection network; this requires labeling the switching nodes of the
network. We show a labeling of the nodes of the network of Figure 4.1 in
Figure 4.5. The input description for this network is shown in Figure 4.6.

(defparameter deterministically-interwired-8x8-rep
  ;; inputs first -- these don't get a conditional probability
  ;; function.
  '((i0 (a b) nil 1/2) (i1 (a b) nil 1/2)
    (i2 (a b) nil 1/2) (i3 (a b) nil 1/2)
    (i4 (c d) nil 1/2) (i5 (c d) nil 1/2)
    (i6 (c d) nil 1/2) (i7 (c d) nil 1/2)
    ;; stage 1 4x4's
    (a (e f g h) 4x2d2-cp-fun) (b (e f g h) 4x2d2-cp-fun)
    (c (e f g h) 4x2d2-cp-fun) (d (e f g h) 4x2d2-cp-fun)
    ;; stage 2 4x4's
    (e (tt0 tt1 tt2 tt3) 4x2d2-cp-fun)
    (f (tt0 tt1 tt2 tt3) 4x2d2-cp-fun)
    (g (tt4 tt5 tt6 tt7) 4x2d2-cp-fun)
    (h (tt4 tt5 tt6 tt7) 4x2d2-cp-fun)
    ;; stage 3 2x2's
    (tt0 (o0 o1) 2x2d1-cp-fun) (tt1 (o0 o1) 2x2d1-cp-fun)
    (tt2 (o2 o3) 2x2d1-cp-fun) (tt3 (o2 o3) 2x2d1-cp-fun)
    (tt4 (o4 o5) 2x2d1-cp-fun) (tt5 (o4 o5) 2x2d1-cp-fun)
    (tt6 (o6 o7) 2x2d1-cp-fun) (tt7 (o6 o7) 2x2d1-cp-fun)
    ;; outputs
    (o0) (o1) (o2) (o3) (o4) (o5) (o6) (o7)))

Figure 4.6: Symbolic description of the network of Figure 4.5. The description
specifies that during each cycle each source node generates a message
with probability 1/2.

The program uses the network representation to build an internal structure
in which (for example) information about independence of channel loads
has been pre-computed, and channels have been assigned names generated
from the names of their nodes of origin and destination. One can then
query the program for the probability mass function of interest. The result
is numerical, as in the example below:

> (setq d8x8 (parse-multistage-network
              deterministically-interwired-8x8-rep))
#<MULTISTAGE-NETWORK 8x8>
> (jlpmf '(tt6-o7-0 tt7-o7-0) d8x8)
(#S(JLPMF-PART CHANNELS (#<CHANNEL TT6-O7-0>
                         #<CHANNEL TT7-O7-0>)
               NUMBER-OF-CHANNELS 2
               VECTOR #(10321939817/17179869184
                        2931771091/17179869184
                        2931771091/17179869184
                        994387185/17179869184)))

Here we have calculated the joint probability mass function of the loads
on two channels leading from two 2 x 2 switches to sink o7 in the network
of Figure 4.1, given a probability of transmission in each message source
of 1/2, and under a uniform destination address distribution. The vector
component of the structure result above is indexed by integers in which the
bit with weight $2^i$ specifies the load of the ith channel (starting with i = 0)
in the vector of channels whose joint loading probability mass function was
required. Thus in the example above, the probability that no messages are
transmitted to sink o7 is 10321939817/17179869184 ≈ 0.601; the probability
that 1 message is transmitted along the channel from switch TT7 to o7
is 2931771091/17179869184 ≈ 0.171, as is the probability that 1 message is
transmitted along the channel to o7 from switch TT6. Finally, the probability
that both channels carry a message is 994387185/17179869184 ≈ 0.058;
we assume here, as in [2], that a message sink can receive two messages
during a single cycle.4

To find the blocking probability of the network, we use Equation (2.1):
we form the probability of successful message transmission as the ratio of the
expected number of messages arriving at sinks to the expected number
of messages entering the network. Because of the symmetry of the network, all
the channels leading to sinks have identical loading probabilities, and so
we can simply sum the expectations of their loads. We have then that the
expected number of messages arriving at a single sink is

$$
1 \cdot \frac{2931771091}{17179869184} + 1 \cdot \frac{2931771091}{17179869184} + 2 \cdot \frac{994387185}{17179869184}
= \frac{981539569}{2147483648} \approx 0.457
$$

and the expected number of messages arriving at all sinks during any cycle is

$$
8 \cdot \frac{981539569}{2147483648} = \frac{981539569}{268435456} \approx 3.66
$$

Because the expected number of messages entering the network is 8 · 1/2 = 4,
we have that the aggregate probability of successful message transmission
in this network at a loading factor of 1/2 is

$$
\frac{E[\text{messages arriving at sinks}]}{E[\text{messages injected by sources}]} = \frac{981539569}{1073741824} \approx 0.914
$$

and thus the blocking probability is approximately 0.086.

We plot for the network of Figure 4.1 the probability of successful message
transmission versus the probability that a source transmits in Figure 4.7.

Figure 4.7: The probability of successful message transmission (P{Success})
plotted against the source transmission probability ($p_i$) for the network
of Figure 4.1, under a uniform destination address distribution.

4 If a sink can receive only one message during a cycle, then the expected number of
messages received by a sink during a cycle will be

$$
1 - \frac{10321939817}{17179869184} = \frac{6857929367}{17179869184} \approx 0.40
$$
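
The arithmetic of this example can be reproduced with exact rational numbers. The Python sketch below is for checking only and is not part of the Common LISP program; the helper name `expected_load` is an assumption. It decodes the vector index as described in the text, with bit i of the index giving the load of the ith channel.

```python
from fractions import Fraction


def expected_load(pmf_vector, n_channels):
    """Expected number of messages on the channels described by pmf_vector.

    Entry `index` of the vector is the probability of the configuration
    in which the load of channel i is bit i of `index`.
    """
    total = Fraction(0)
    for index, prob in enumerate(pmf_vector):
        messages = sum((index >> i) & 1 for i in range(n_channels))
        total += messages * prob
    return total


D = 17179869184
o7_pmf = [Fraction(10321939817, D), Fraction(2931771091, D),
          Fraction(2931771091, D), Fraction(994387185, D)]

per_sink = expected_load(o7_pmf, 2)      # one sink, two incoming channels
arriving = 8 * per_sink                  # eight symmetric sinks
injected = 8 * Fraction(1, 2)            # eight sources, each with p = 1/2
p_success = arriving / injected
```

Here `per_sink` evaluates to 981539569/2147483648 and `p_success` to 981539569/1073741824, in agreement with the values above.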

The  Common  LISP  implementation  internally  records  joint  probability 
mass  functions  so  that  they  need  not  be  recomputed.  The  implementation 
has  been  coded  with  some  attention  to  performance,  because,  although  the 
asymptotic  performance  is  pessimal,  the  same  code  is  used  on  subnetworks 
of  larger  networks  in  an  approximation  scheme. 
Figure 4.8: A 16 x 16 network with random interwiring in the first and
second  stages.  The  figure  is  from  [2]. 

4.6 Applicability of Exact Calculation of Blocking Probabilities

We  have  presented  a  means  of  exact  calculation  of  the  blocking  probability 
of  a  multistage  network  with  redundant  paths,  and  demonstrated  its  use  in 
a  program  that  automatically  calculates  blocking  probabilities  and  exploits 
independence  of  channel  loading  probabilities  where  this  is  possible. 

The implementation described cannot be used to calculate the blocking
probabilities of networks with much more path redundancy than the one of
Figure 4.1. We might consider an implementation that could exploit the
symmetry exhibited by some multistage networks, but such an implementation
could still not be used on a network like that in Figure 4.8, in which
the wiring in the first and second stages is not symmetric and is in fact
randomly generated. That such networks are of interest is demonstrated in
[16].

Thus we must seek approximate solutions to the problem. This we do
in the next chapter, where we will see that the exact equations and our
algorithm for solving them can be used to realize a faster approximation
method.


Chapter  5 

Approximating  Performance 
Parameters  for  Multipath 
Networks 


5.1  Introduction 

We saw in the previous chapter that exact calculation of the probability
mass functions of channels leading to sinks in a multipath network could
be very expensive. In this chapter we seek a method of approximation of
performance parameters that will allow us to estimate to within a given
error the loading probability of a channel leading to a sink. We will do
this by using Monte Carlo methods, attempting both direct simulation of
the network and also approximation of Equation (4.3), and we will compare the
expense and error of the two methods.

Our approximations use exactly the model we described in Sections 2.1
and 4.2. We will find that this exact correspondence is important as we
develop a method of approximating solutions to the equations by a combination
of simulation and exact methods.


5.2  Direct  Simulation 


In direct simulation, we simulate the transition through the network of a
group of messages generated in a single cycle as follows:

1. Messages are generated for the cycle being simulated by sources in
accordance with the source transmission probabilities $p_i$.

2. Addresses are picked according to the destination address distribution.

3. The messages arrive at switching elements and are directed in logical
directions in accordance with their addresses.

• The direction of more messages in a logical direction than there
are channels in the direction is resolved by randomly choosing
which messages are blocked.

• Output channels within a logical direction are selected randomly,
with each channel having the same probability of being selected
to carry a message.

4. Step 3 is repeated until we have calculated the loads of the channels
whose states we are examining in the simulation.

Note that, using the results of Section 3.6, we can generate the same
distribution of messages as we do in step 2 by modifying step 3 to randomly
pick, for each message, a logical direction in accordance with the switching
probabilities of the switch. Our simulation algorithm then becomes:

1. Messages are generated for the cycle being simulated by sources in
accordance with the source transmission probabilities $p_i$.

2. The messages arrive at switching elements and are directed in logical
directions in accordance with the switching probabilities of the switch.

• The direction of more messages in a logical direction than there
are channels in the direction is resolved by truncating the number
of messages to the dilation of the logical direction.

• Output channels within a logical direction are selected randomly,
with each channel having the same probability of being selected
to carry a message.

3. Step 2 is repeated until we have calculated the loads of the channels
whose states we are examining in the simulation.

Simulation procedures for the random selections described above are
straightforward. We describe them briefly here; details of these techniques
can be found in introductory texts on probability models (cf. [25]).
Message generation is performed by simulating the generation of a Bernoulli
random variable with the source transmission probability. Selection of logical
directions for a message can be performed by subdividing the half-open
interval [0, 1) into as many segments as there are logical directions, the length
of the segment for a logical direction being the same as the probability of
selecting that direction. A uniform random variable u is generated and the
segment into which u falls is taken as corresponding to the selected logical
direction. Finally, the random selection of output channels within a logical
direction can be performed in many ways; we do so by considering the K
channels to correspond to bits in a K-bit vector. If there are n messages to
be directed in the logical direction, we set only the low n bits in the vector
and then randomly permute the vector, which can be done in O(K) steps.1
The bits that are set after the permutation correspond to the channels that
carry messages.
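
The three random selections just described can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the program used in the text: the function names are hypothetical, the random number generator is Python's `random.Random`, and the permutation is the library shuffle rather than the specific algorithm of [25].

```python
import random


def generate_message(p, rng):
    """Bernoulli trial: True with probability p."""
    return rng.random() < p


def pick_direction(q, rng):
    """Pick a logical direction by locating a uniform variate in [0, 1).

    q: list of switching probabilities; segment i has length q[i].
    """
    u = rng.random()
    upper = 0.0
    for direction, q_i in enumerate(q):
        upper += q_i
        if u < upper:
            return direction
    return len(q) - 1  # guard against floating-point round-off


def place_messages(n, k, rng):
    """Loads of the K channels of one direction when n messages arrive."""
    n = min(n, k)                     # excess messages are blocked
    loads = [1] * n + [0] * (k - n)   # set the low n "bits"
    rng.shuffle(loads)                # uniform random permutation, O(K)
    return loads
```

A call such as `place_messages(1, 4, random.Random(0))` returns a list of four loads of which exactly one is 1, at a position chosen uniformly.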


5.3  Approximation  of  Performance  Parameters 
Using  Direct  Simulation 

Repeated simulations can be used to approximate the parameter of interest
by the Monte Carlo method. Suppose that what we are interested in is the
probability that some set of channels C has a particular loading configuration
l. We run some number N of simulations, examining after each simulation
the loads on the channels C. If the channels have the loading configuration
l, the experiment is considered a "hit" and has value 1. If the channels do
not have the loading configuration l, the experiment is a "miss" with value
0. The mean of the values of the experiments is taken as an approximation
of the probability that the channels have the loading configuration l.

Now we describe direct simulation using standard notation, as the textual
description above would prove too unwieldy later in the chapter.2
Let $r_1, \ldots, r_k$ denote all the random variates that might be required
to perform a single direct simulation by the algorithm described above.3
Then let $R = (r_1, r_2, \ldots, r_k)$ be a vector of these random variates. Now let
$R_1, R_2, \ldots, R_n$ be a sequence of such vectors, identically and independently
distributed.

1 Using an algorithm on pp. 474-476 of [25].

2 We use the notation of [10].

3 That the number of random variates that might be required is finite will be clear when
we consider that only a finite number of outcomes from each experiment is possible.
If $f(R)$ is a function whose value is 1 where the channels C whose states
we are examining in simulation have the load l, and 0 where they do not,
then the variables

$$
f_i = f(R_i)
$$

are identically and independently distributed. If $E[f(R)] = \mu$, then

$$
\bar{f} = \frac{1}{n} \sum_{i=1}^{n} f_i
$$

is an unbiased estimator of $P\{L_C = l\} = \mu$.

In order to calculate error bounds on our approximations, we will need
to know the variance of $\bar{f}$. Because the $f_i$ are Bernoulli,

$$
\mathrm{Var}(\bar{f}) = \frac{1}{n} \mu (1 - \mu)
$$

because $\mu(1 - \mu)$ is the variance of $f(R)$. Unfortunately, this expression
will not be very useful in practice, as we do not a priori know $\mu$, or there
would be no need to estimate it. Thus we estimate the variance of $f(R)$,
using the unbiased estimator

$$
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (f_i - \bar{f})^2 \approx \mathrm{Var}(f(R))
$$

There are means of estimating the variance of $s^2$, but we will not use
these, as in practice the variance is small, and our error bounds are
conservative.

Given the estimate $s^2$ for $\mathrm{Var}(f(R))$, we may estimate the variance of
$\bar{f}$ as

$$
\mathrm{Var}(\bar{f}) \approx \frac{s^2}{n}
$$

yielding a standard error of $s/\sqrt{n}$, which shows clearly that the error will
vary as the inverse root of the number of trials.
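
The estimator, the sample variance, and the standard error can be computed together. The following Python sketch is illustrative only; the function name `estimate_probability` and the convention that a trial is a function of a random number generator returning 0 or 1 are assumptions, and a single Bernoulli draw stands in for a full network simulation.

```python
import math
import random


def estimate_probability(trial, n, rng):
    """Return (mean, s2, standard error) over n hit/miss experiments.

    trial(rng) must return 1 for a hit and 0 for a miss.
    """
    hits = sum(trial(rng) for _ in range(n))
    mean = hits / n
    # For 0/1 data the sum of squared deviations splits into the hits,
    # each contributing (1 - mean)^2, and the misses, each mean^2.
    s2 = (hits * (1 - mean) ** 2 + (n - hits) * mean ** 2) / (n - 1)
    return mean, s2, math.sqrt(s2 / n)


# Stand-in experiment (an assumption): a hit with probability 0.6.
mean, s2, std_err = estimate_probability(
    lambda rng: 1 if rng.random() < 0.6 else 0, 10000, random.Random(42))
```

Quadrupling the number of trials roughly halves `std_err`, which is the inverse-root behavior noted above.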


5.4  Bounding  the  Number  of  Iterations 

To bound the number of iterations for which our simulation must run in
order to achieve a specified level of precision at a specified confidence, we
can use the Chebyshev Inequality, which states that if X is a random variable
with mean $\mu$ and variance $\sigma^2$, then

$$
P\{|X - \mu| \ge k\} \le \frac{\sigma^2}{k^2}
$$

Call the number of iterations performed n. Suppose that we wish to bound
by c the probability that our estimate $\bar{f}$ deviates from the value $\mu$ being
estimated by more than some fraction d of $\mu$. Because the variance of $\bar{f}$ is
$\sigma^2 / n$, we have

$$
P\{|\bar{f} - \mu| \ge d\mu\} = P\left\{\frac{|\bar{f} - \mu|}{\mu} \ge d\right\} \le \frac{\sigma^2}{n d^2 \mu^2}
\tag{5.1}
$$

Now we can estimate the number of iterations we require by considering c,
the complement of our desired confidence level:

$$
c = \frac{\sigma^2}{n d^2 \mu^2}, \qquad n = \frac{\sigma^2}{c d^2 \mu^2}
\tag{5.2}
$$

In practice we use the estimate $s^2$ for $\sigma^2$ and the estimate $\bar{f}$ for $\mu$ in
calculating a projected number of iterations. We repeat the calculation after
each iteration of the algorithm and check to see whether we have performed
enough iterations to bound the error as desired.
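
As a worked instance of Equation (5.2), the Python sketch below projects the number of iterations for the channel pair examined in Section 4.5, using the exact probability 10321939817/17179869184 in place of the running estimates. The function name `chebyshev_iterations` is an assumption made for illustration.

```python
import math


def chebyshev_iterations(variance, mean, d, c):
    """Smallest n with variance / (n d^2 mean^2) <= c, per Equation (5.2)."""
    return math.ceil(variance / (c * d ** 2 * mean ** 2))


mu = 10321939817 / 17179869184          # exact probability from Section 4.5
n_needed = chebyshev_iterations(mu * (1 - mu), mu, d=0.01, c=0.05)
```

With a 1% relative deviation and 95% confidence this gives a projection of roughly 133,000 iterations.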

The Chebyshev Inequality provides a conservative bound on the number
of iterations required. For large numbers n of simulations, we expect
from the Central Limit Theorem that the distribution of $\bar{f}$ is approximately
normal. Thus, for example, we may have 95% confidence that $\bar{f}$ is in the
interval $\mu \pm 1.96\,\sigma/\sqrt{n}$. We can use the Central Limit Theorem in the
same way that we did the Chebyshev Inequality, to calculate a projected
number of iterations required to bound the error as desired.

By the Central Limit Theorem we have that

$$
P\left\{\frac{f_1 + f_2 + \cdots + f_n - n\mu}{\sigma\sqrt{n}} \le a\right\} \to \Phi(a) \quad \text{as } n \to \infty
$$

so that

$$
P\left\{|\bar{f} - \mu| \le \frac{a\sigma}{\sqrt{n}}\right\} \to \Phi(a) - \Phi(-a) = 2\Phi(a) - 1 \quad \text{as } n \to \infty
\tag{5.3}
$$

Substituting d for $a\sigma / (\mu\sqrt{n})$ and taking the complement, we have then

$$
P\left\{\frac{|\bar{f} - \mu|}{\mu} > d\right\} \to 2\left(1 - \Phi\left(\frac{d\mu\sqrt{n}}{\sigma}\right)\right) \quad \text{as } n \to \infty
\tag{5.4}
$$

where as before we use the estimate s for $\sigma$ and the estimate $\bar{f}$ for $\mu$ in
practice. If we wish to bound by c the probability that $\bar{f}$ varies from the
desired result by more than the fraction d, we may use our formula by calculating after
each iteration of the simulation the quantity $2\left(1 - \Phi\left(d\bar{f}\sqrt{n}/s\right)\right)$ and halting
when it is less than c.4
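
The stopping quantity can be evaluated with the error function from the standard library. The Python sketch below is an illustration of the formula only; the function names `normal_cdf` and `clt_error_probability` are assumptions and do not describe the stopping function used by the Common LISP program.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def clt_error_probability(mean, s2, n, d):
    """Approximate P{|estimate - mu| / mu > d} for n trials.

    mean and s2 are the running estimates of mu and sigma^2.
    """
    a = d * mean * math.sqrt(n) / math.sqrt(s2)
    return 2.0 * (1.0 - normal_cdf(a))


# Running estimates after 5000 trials, rounded as in a progress report.
confidence = 1.0 - clt_error_probability(0.607, 0.239, 5000, 0.01)
```

For a mean of 0.607 and a variance of 0.239 after 5000 trials, with d = 0.01, the confidence comes out near 0.62; the simulation would halt once the error probability drops below c.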


5.5  An  Example  of  Direct  Simulation 

A program has been written to estimate the probability that a set of channels
in a network will have a particular load, using the simulation algorithm of
Section 5.2. Although simulation will let us estimate blocking probabilities
for larger networks, and we will use a larger network later in the chapter, here
we use the network of Figure 4.1, reproduced here in Figure 5.1. We do so
because we know an exact result for this network (as shown in Section 4.5),
and thus we can verify that in this example the simulation algorithm achieves
the error bounds it should.

We will estimate the probability that both of the channels leading to
sink 7 in this network carry no messages. We had determined in Section 4.5
that this probability (under uniform addressing, with each source having a
probability of 0.5 of generating a message at each cycle) was

$$
P\{L_{TT6\text{-}o7\text{-}0} = 0, \; L_{TT7\text{-}o7\text{-}0} = 0\} = \frac{10321939817}{17179869184} \approx 0.6008
$$

4 Of course, we could also use the inverse function $\Phi^{-1}$ to allow us to project a number
of iterations, but we have not done this here.
Figure 5.1: The 8x8 deterministically-interwired network of Figure 4.1. In
the example we estimate the probability that the channels leading to sink 7
carry no messages, under uniform addressing.

We see in Figure 5.2 the result of running the program to estimate the
required probability, using Equation (5.4) to calculate the number of iterations
necessary to achieve an estimate that lies within 1% of the actual value with
95% confidence.

We see that approximately 25,000 iterations are required to estimate the
value

$$
\frac{15222}{25211} \approx 0.604
$$

which is indeed within 1% of the exact solution. Using the more conservative
bound of Equation (5.2), the simulation runs for about 133,000 iterations,
yielding a result of 79825/132847 ≈ 0.6009.


> (setq o7-channels (elements-named '(tt6-o7-0 tt7-o7-0) d8x8))
(#<CHANNEL TT6-O7-0> #<CHANNEL TT7-O7-0>)
> (simulate-multi-channel-loading-probability d8x8 o7-channels '(0 0)
    (make-clt-stopping-function .01 .05 5000))
Iteration 15; mean: .667; variance: .238; current confidence .042
Iteration 5000; mean: .607; variance: .239; current confidence .620
Iteration 10000; mean: .603; variance: .239; current confidence .782
Iteration 15000; mean: .605; variance: .239; current confidence .871
Iteration 20000; mean: .603; variance: .239; current confidence .919
Iteration 25000; mean: .604; variance: .239; current confidence .949
15222/25211
76026279/317784655
25211

Figure 5.2: Estimating by direct simulation the probability that both channels
leading to sink 7 in the network of Figure 5.1 carry no messages, under
uniform addressing and with a source transmission probability of 1/2.


5.5.1 The Expense of Direct Simulation

For a network with N stages with M channels between each stage, an iteration
of the direct simulation algorithm of Section 5.2 runs in time O(NM).
We can use Equation (5.2) to bound the total cost of estimating $\mu$ with a
deviation factor of d and at a confidence of 1 - c as

$$
O\left(N M \frac{\sigma^2}{c d^2 \mu^2}\right) = O\left(N M \frac{1 - \mu}{c d^2 \mu}\right)
$$

because in the Bernoulli trials that make up the iterations of a direct
simulation we have $\sigma^2 = \mu(1 - \mu)$.


5.6 Approximating a Solution to the Exact Equations

5.6.1 Approximating Equation (4.2) Across a Single Stage

We saw in Chapter 4 that our method of exact calculation of blocking
probabilities suffered from exponential increase in the expense of calculation
as the number of dependent paths between stages increased. Equation (4.2)
specified the probability that a set of output channels of switches, depicted
in Figure 5.3, carried a particular load:
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IM/r,  =  T  , . Lrk  -  I (\]  = 

E  =  '<5 . =  /rA  |  /-«,  =  /ft . k=/«J  ■ 

ft, . tfl,„ 

P{/-ft  = /ft . /-ft,,  =  /«,„}  (r>-5) 

when'  the  sum  is  over  all  tuples  /g,„  with  elements  in  {().  1}. 

A  method  of  approximate  solution  of  this  equation  that  suggests  itself 
immediately  is  one  of  the  following  form: 

Rather than calculating this sum over all tuples $l_{B_1}, \ldots, l_{B_m}$,
calculate it exactly for only some of the tuples.

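To be more precise, suppose that we define

$$g(l_{B_1}, \ldots, l_{B_m}) = P\{L_{C_1} = l_{C_1}, \ldots, L_{C_k} = l_{C_k} \mid L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\}$$

and we generate tuples $l_{B_1}, \ldots, l_{B_m}$ randomly in accordance with the
probability mass function $P\{L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\}$.

Now $g(l_{B_1}, \ldots, l_{B_m})$ is a random variable, and its expectation is

$$\begin{aligned}
E[g(l_{B_1}, \ldots, l_{B_m})]
&= \sum_{l_{B_1}, \ldots, l_{B_m}} g(l_{B_1}, \ldots, l_{B_m})\, P\{L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\} \\
&= \sum_{l_{B_1}, \ldots, l_{B_m}} P\{L_{C_1} = l_{C_1}, \ldots, L_{C_k} = l_{C_k} \mid L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\} \cdot P\{L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\} \\
&= P\{L_{C_1} = l_{C_1}, \ldots, L_{C_k} = l_{C_k}\}
\end{aligned}$$

Thus we see that $g(l_{B_1}, \ldots, l_{B_m})$ is an unbiased estimator of the
probability we wished to estimate: $P\{L_{C_1} = l_{C_1}, \ldots, L_{C_k} = l_{C_k}\}$.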
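The unbiasedness of the estimator can be illustrated on a toy case. The sketch below assumes a single 2 x 2 switch under uniform addressing, so that an output channel C is empty with probability (1/2)^(l_B1 + l_B2) given the input loads. The joint mass function `JOINT_PMF` for the (deliberately dependent) input loads is invented for the example, and every name in the block is hypothetical rather than taken from the program described in this chapter.

```python
import random

# Hypothetical joint PMF of the loads (l_B1, l_B2) on the two input channels.
# The loads are dependent: P{1,1} differs from P{l_B1 = 1} * P{l_B2 = 1}.
JOINT_PMF = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def g(loads):
    """P{L_C = 0 | input loads}: each message present picks output C with
    probability 1/2 (assumed uniform addressing on a 2 x 2 switch)."""
    return 0.5 ** sum(loads)


def exact_probability():
    """Equation (5.5): sum the conditional probability over every tuple."""
    return sum(g(loads) * p for loads, p in JOINT_PMF.items())


def estimated_probability(samples, rng):
    """Average g over tuples drawn according to the joint PMF."""
    tuples = list(JOINT_PMF)
    weights = [JOINT_PMF[t] for t in tuples]
    draws = rng.choices(tuples, weights=weights, k=samples)
    return sum(g(t) for t in draws) / samples


exact = exact_probability()
estimate = estimated_probability(20000, random.Random(1))
print(exact, estimate)
```

With only four tuples the exact sum is trivial; the point of the sketch is that the sample average of the conditional probability converges to the same value, which is what makes the estimator usable when the number of tuples is too large to enumerate.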

We can readily calculate

$$P\{L_{C_1} = l_{C_1}, \ldots, L_{C_k} = l_{C_k} \mid L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\}$$

by factoring the expression as in Equation (4.4). As we observed in
Chapter 4, although the loads on the channels $C_1, \ldots, C_k$ are not in
general independent, the loads on the subset of channels originating at each
switching element are independent given the message loads on the input
channels $B_1, \ldots, B_m$.
Figure 5.4: The solid box shows the stages of the network for which the
estimator $g$ performs an exact calculation; the dotted box shows the stages
of the network for which $h$ will perform an exact calculation.

5.6.2  Approximating  Equation  (4.2)  Across  Multiple  Stages 

We have then an estimator for the probability that a set of channels at
some stage in the network carries a particular load. We may estimate the
value of this probability by generating, in accordance with the appropriate
probability distribution, input loads for the switches at which the channels
originate. It is natural now to ask whether we might be able to extend the
estimation technique to cover more than one stage of the network.

The situation is as depicted in Figure 5.4. We have an estimator $g$ that
will allow us to estimate the probability of loads on the channels
$C_1, \ldots, C_k$ if we generate the loads on the input channels
$B_1, \ldots, B_m$. We require an estimator $h$ that will allow us to estimate
the probability of loads on the channels $O_1, \ldots, O_n$ by generating the
loads on $B_1, \ldots, B_m$.

The estimator $h(l_{B_1}, \ldots, l_{B_m})$ will in fact simply be

$$h(l_{B_1}, \ldots, l_{B_m}) = P\{L_{O_1} = l_{O_1}, \ldots, L_{O_n} = l_{O_n} \mid L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\} \qquad (5.6)$$

which, by an argument identical to that for $g$, will be an unbiased estimator
of $P\{L_{O_1} = l_{O_1}, \ldots, L_{O_n} = l_{O_n}\}$.

To evaluate the conditional probability, we define

$$Q\{e\} = P\{e \mid L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\}$$

If the input channels to the final-stage switches are called
$D_1, \ldots, D_j$, we now have

$$Q\{L_{O_1} = l_{O_1}, \ldots, L_{O_n} = l_{O_n}\} = \sum_{l_{D_1}, \ldots, l_{D_j}} Q\{L_{O_1} = l_{O_1}, \ldots, L_{O_n} = l_{O_n} \mid L_{D_1} = l_{D_1}, \ldots, L_{D_j} = l_{D_j}\} \cdot Q\{L_{D_1} = l_{D_1}, \ldots, L_{D_j} = l_{D_j}\} \qquad (5.7)$$

which is similar to Equation (4.2). Note that

$$Q\{L_{O_1} = l_{O_1}, \ldots, L_{O_n} = l_{O_n} \mid L_{D_1} = l_{D_1}, \ldots, L_{D_j} = l_{D_j}\} = P\{L_{O_1} = l_{O_1}, \ldots, L_{O_n} = l_{O_n} \mid L_{D_1} = l_{D_1}, \ldots, L_{D_j} = l_{D_j}\}$$

because, given the loads on the input channels $D_1, \ldots, D_j$, the loading
probabilities on the channels $O_1, \ldots, O_n$ are independent of the loads on
$B_1, \ldots, B_m$, so long as these are distinct from $D_1, \ldots, D_j$. Thus
the conditional probability inside the summation can be factored in the same
fashion as that in Equation (4.2).

We can evaluate the term

$$Q\{L_{D_1} = l_{D_1}, \ldots, L_{D_j} = l_{D_j}\}$$

using Equation (5.7) recursively, just as we did with Equation (4.2). In fact,
the only point at which the evaluation of $h(l_{B_1}, \ldots, l_{B_m})$ will
differ from that of a network comes when the channels $D_1, \ldots, D_j$
correspond to the channels $B_1, \ldots, B_m$. At this point we will be
evaluating

$$Q\{L_{B_1} = l'_{B_1}, \ldots, L_{B_m} = l'_{B_m}\} = P\{L_{B_1} = l'_{B_1}, \ldots, L_{B_m} = l'_{B_m} \mid L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\}$$

which will be 1 only when

$$l_{B_1} = l'_{B_1}, \ldots, l_{B_m} = l'_{B_m}$$

and will be 0 otherwise.

It is interesting to note that this last expression can thus be factored as

$$Q\{L_{B_1} = l'_{B_1}, \ldots, L_{B_m} = l'_{B_m}\} = P\{L_{B_1} = l'_{B_1} \mid L_{B_1} = l_{B_1}\} \cdots P\{L_{B_m} = l'_{B_m} \mid L_{B_m} = l_{B_m}\}$$

which demonstrates that, given the input loads, the individual channel
probabilities are independent. In particular, we see that in evaluating
$h(l_{B_1}, \ldots, l_{B_m})$ we may treat the channels $B_1, \ldots, B_m$ as the
source channels of a network the sources of which have transmission
probability 0 when $l_{B_i} = 0$, and transmission probability 1 when
$l_{B_i} = 1$.

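That is, we see from Equation (4.4) that a channel leading from a source
that transmits with probability $l_{B_i}$ has a loading probability mass function

$$P\{L_{B_i} = l'_{B_i}\} = \begin{cases} 1 - l_{B_i} & \text{if } l'_{B_i} \text{ is } 0 \\ l_{B_i} & \text{if } l'_{B_i} \text{ is } 1 \end{cases} \qquad (5.8)$$

which, because $l_{B_i}$ and $l'_{B_i}$ can only be 0 or 1, is the same as

$$P\{L_{B_i} = l'_{B_i} \mid L_{B_i} = l_{B_i}\}$$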


Therefore we see that we may evaluate the conditional probability that
is the definition of the estimator $h$ by means of recursive application of
Equation (4.4) with a network whose sources $l_{B_1}, \ldots, l_{B_m}$ are
connected to channels $B_1, \ldots, B_m$. Source $l_{B_i}$ has source
transmission probability 0 when $l_{B_i} = 0$, and source transmission
probability 1 when $l_{B_i} = 1$.

Thus a scheme for approximating Equation (4.2) is to pick a stage at
which to divide the network, and solve the network to the right of it exactly,
given source transmission probabilities equal to loads that we generate with
probabilities given by the joint probability mass function of the channels
where the division was made. This yields a sample value of the unbiased
estimator $h(l_{O_1}, \ldots, l_{O_n})$, whose expectation we may evaluate by a
Monte Carlo method.


5.6.3  Generating Random Variates from the Joint Probability
Mass Function $P\{L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\}$

It remains to describe a method of generating random tuples
$l_{B_1}, \ldots, l_{B_m}$ in accordance with the probability mass function
$P\{L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\}$.
The method is straightforward: we simply simulate the network using the
method of Section 5.2, and use the channel loads generated by the simulation.
Because we were careful that our simulation would correspond exactly
to the equations, the random variates generated this way will have the mass
function $P\{L_{B_1} = l_{B_1}, \ldots, L_{B_m} = l_{B_m}\}$.

Thus we see that one method of approximate solution of the exact equations
corresponds to combining simulation and exact calculation. In fact,
looked at another way, solving for the loading probabilities of the subnetwork
made up of the later stages is simply a means of reducing the variance of the
simulation, because, as we shall see, $h(l_{O_1}, \ldots, l_{O_n})$ will always
have lower variance than the corresponding Bernoulli variable in direct
simulation.
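The combination of simulation and exact calculation can be sketched on a deliberately small network. The block below is an illustration under stated assumptions, not the program described in this chapter: four sources each transmit with probability 1/2, two first-stage 2 x 2 switches each feed one channel (B1, B2) into a final 2 x 2 switch, addressing is uniform, and we estimate the probability that one output channel O of the final switch is empty. The left stage is simulated; the right stage is either solved exactly (`h`) or simulated as well (`f`). All function names are hypothetical.

```python
import random

P_SOURCE = 0.5  # assumed source transmission probability


def simulate_left_stage(rng):
    """Simulate one cycle of two first-stage 2 x 2 switches and return the
    loads (l_B1, l_B2) on the channels that feed the final-stage switch."""
    loads = []
    for _switch in range(2):
        wanted = False
        for _source in range(2):
            sends = rng.random() < P_SOURCE
            picks_b = rng.random() < 0.5  # uniform addressing
            wanted = wanted or (sends and picks_b)
        loads.append(1 if wanted else 0)
    return tuple(loads)


def h(loads):
    """Exact solution of the final stage: probability that output channel O
    is empty, given the loads on its two input channels."""
    return 0.5 ** sum(loads)


def f(loads, rng):
    """Direct simulation of the final stage: 1 if O is empty, else 0."""
    return int(all(rng.random() >= 0.5 for _ in range(sum(loads))))


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return mean, sum((v - mean) ** 2 for v in values) / (n - 1)


rng = random.Random(7)
samples = [simulate_left_stage(rng) for _ in range(20000)]
mean_h, var_h = mean_and_variance([h(s) for s in samples])
mean_f, var_f = mean_and_variance([f(s, rng) for s in samples])
print(mean_h, var_h, mean_f, var_f)
```

Both estimators target the same probability, which for this toy network is (25/32)^2; the sample variance of the exact-final-stage estimator is markedly smaller than that of the Bernoulli indicator, which is the effect exploited in the examples that follow.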


5.7  Examples of Approximation of the Exact Equations

A  program  has  been  written  to  use  the  approximation  method  described  in 
the  previous  section.  We  will  first  examine  some  details  of  the  performance 
of  the  method  by  considering  some  examples  in  detail.  Then  we  will  use  the 
techniques  we  have  described  to  compare  the  performance  of  three  networks. 


5.7.1  Performance  of  the  approximation  method  on  some 
simple  examples 


For a first example, let us consider the familiar network depicted in
Figure 5.5. Here the estimator $h$ is used for only the final stage of the network.
The resulting run is shown in Figure 5.6.

We see that about 11,000 iterations are required to estimate a loading
probability of

$$\frac{100811}{168112} \approx 0.5997$$

as compared to about 25,000 iterations for the same error bound by direct
simulation. The reason for the difference is directly evident when we
compare the variance of $f(R)$ to that of $h(l_{O_1}, \ldots, l_{O_n})$:

$$\mathrm{Var}(f(R)) \approx 0.239 \quad \text{but} \quad \mathrm{Var}(h(l_{O_1}, \ldots, l_{O_n})) \approx 0.098$$

so that the variance has been reduced by a factor of about 2.43.

This is in fact a general result: the variance of $h$ will always be lower
than that of $f$, as we shall see in Section 5.8.
Figure 5.5: The 8 x 8 deterministically-interwired network of Figure 4.1.
The left box contains the two stages simulated in the first example of
Section 5.7.1; the right box contains the stage solved for exactly.


> (estimate-loading-probability d8x8-left-2 d8x8-right-1
    '((tt6-o7-0 0) (tt7-o7-0 0))
    (make-clt-stopping-function .01 .05 5000)
    '(g-tt6-sink h-tt6-sink g-tt7-sink h-tt7-sink))
Iteration 15; mean: 0.7; variance: .151; current confidence .056
Iteration 5000; mean: .597; variance: 0.1; current confidence .819
Iteration 10000; mean: .601; variance: .098; current confidence .944
100811/168112
231602841/2354912896
10507


Figure 5.6: Estimation by approximation method of the probability that
both channels leading to sink 7 in the network of Figure 5.5 carry no
messages, under uniform addressing and with a source transmission probability
of 1/2. Compare to Figure 5.2.
Figure 5.7: The 16 x 16 randomly-interwired network of Figure 4.8. The
network is from [2].

It will be clear from Equations (5.2) and (5.3) that, all other factors
remaining equal, the number of iterations required to achieve an error bound
is a linear function of the variance of the random variable whose expectation
is being estimated.

Now we try direct simulation and our approximation method on the
four-stage 16 x 16 network of Figure 4.8, reproduced in Figure 5.7.
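The linear dependence on variance can be checked numerically against the runs shown in Figures 5.2 and 5.6. The sketch below assumes that the stopping rule is the usual normal-approximation (central limit theorem) bound n >= z^2 sigma^2 / (d mu)^2, where z is the two-sided normal quantile for confidence 1 - epsilon; that form is an assumption made for illustration and is not taken from the program itself, and the function name `required_iterations` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_iterations(mean, variance, deviation=0.01, eps=0.05):
    """Iterations needed for the sample mean to lie within a relative
    deviation `deviation` of `mean` with confidence 1 - eps, under a
    normal approximation (an assumed form of the stopping rule)."""
    z = NormalDist().inv_cdf(1 - eps / 2)  # two-sided normal quantile
    return ceil(z * z * variance / (deviation * mean) ** 2)


# Final mean and variance printed in Figure 5.2 (direct simulation).
n_direct = required_iterations(15222 / 25211, 76026279 / 317734655)
# Final mean and variance printed in Figure 5.6 (approximation method).
n_approx = required_iterations(100811 / 168112, 231602841 / 2354912896)
print(n_direct, n_approx, n_direct / n_approx)
```

Under this assumed rule both estimates land close to the iteration counts reported in the two figures, and their ratio tracks the ratio of the two variances, since the two sample means are nearly equal.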

We  see  in  Figure  5.8  the  results  of  using  direct  simulation  to  estimate 
the  probability  that  the  top  channel  leading  to  sink  0  in  the  network  of 
Figure  5.7  carries  no  messages.  In  Figure  5.9  we  see  the  results  of  using 
the  approximation  method  where  exact  calculation  is  used  for  only  the  final 
stage  of  the  network.  Finally,  in  Figure  5.10  we  see  the  results  of  using 
the  approximation  method  where  exact  calculation  is  used  for  the  final  two 
stages  of  the  network.  In  all  three  cases,  uniform  addressing  was  used,  with 
sources  having  transmission  probabilities  of  1/2. 

Where direct simulation was used, the variance was ≈ 0.171; where exact
calculation was used for only the final stage, the variance was ≈ 0.072; where
exact calculation was used for the final two stages, the variance was ≈ 0.018.
> (simulate-multi-channel-loading-probability rnd16x16
    (elements-named '(tt0-o0-0) rnd16x16) '(0)
    (make-clt-stopping-function .01 .05 5000))
Iteration 15; mean: 0.8; variance: .171; current confidence 0.06
Iteration 5000; mean: .778; variance: .173; current confidence .815
Iteration 10000; mean: 0.78; variance: .172; current confidence 0.94
8392/10737
2459905/14409054
10737


Figure 5.8: Estimating the probability that the first channel leading to sink
0 in the network of Figure 5.7 carries no messages, by direct simulation.
10,737 iterations were required to achieve the error bound of ±1% with
95% confidence. Here uniform addressing was used, with sources having
transmission probabilities of 1/2.


> (estimate-loading-probability rnd16x16-left-3 rnd16x16-right-1
    '((tt0-o0-0 0))
    (make-clt-stopping-function .01 .05 5000)
    '(s-tt0-sink t-tt0-sink))
Iteration 15; mean: 0.75; variance: 0.08; current confidence .082
7101/9076
2976821/41177812
4538


Figure 5.9: Estimating the probability that the first channel leading to sink
0 in the network of Figure 5.7 carries no messages, by approximation where
exact calculation is used for only the final stage of the network. 4,538
iterations were required to achieve the error bound of ±1% with 95% confidence.
Uniform addressing was used, with sources having transmission probabilities
of 1/2.
> (estimate-loading-probability rnd16x16-left-2 rnd16x16-right-2
    '((tt0-o0-0 0))
    (make-clt-stopping-function .01 .05 5000)
    '(j-s-sink k-s-sink l-s-sink m-s-sink
      j-t-sink k-t-sink l-t-sink m-t-sink))
Iteration 15; mean: .762; variance: 0.02; current confidence .164
16577/21376
935754207/51132760064
1169


Figure 5.10: Estimating the probability that the first channel leading to sink
0 in the network of Figure 5.7 carries no messages, by approximation where
exact calculation is used for the final two stages of the network. 1,169
iterations were required to achieve the error bound of ±1% with 95% confidence.
Uniform addressing was used, with sources having transmission probabilities
of 1/2.

We see then that by using exact calculation for two stages of this network,
we reduce the number of iterations necessary by a factor of about 9. In the
next section we will see why we can always expect lower variance from $h$
than from $f$.
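As a check on these figures, the iteration counts reported in Figures 5.8 to 5.10 and the rounded variances quoted in the text give the following ratios. The small discrepancies are to be expected from rounding and from the slightly different sample means of the three runs (about 0.782, 0.782, and 0.776).

```latex
\[
\frac{10737}{4538} \approx 2.37, \qquad \frac{0.171}{0.072} \approx 2.4
\]
\[
\frac{10737}{1169} \approx 9.18, \qquad \frac{0.171}{0.018} = 9.5
\]
```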

5.7.2  A  comparison  of  the  performance  of  three  networks 

We present three example networks, all taken from [2]. The first network,
shown in Figure 5.11, is constructed from two non-dilated four-stage
networks connecting 16 endpoints. Because the degree of path-redundancy
is small (there are only two paths connecting any two endpoints), automatic
calculation of the exact probability of successful message transmission is
feasible.

The second network, shown in Figure 5.12, is a deterministically-interwired
multipath network constructed from 4 x 2, dilation-2 crossbars and 2 x 2
crossbars. As can be seen in the figure, multiple paths connect any two
endpoints, and calculation of the exact probability of successful message
transmission is not quickly feasible on current uniprocessor workstations.

The third network is the randomly-interwired multipath network of
Figure 5.7. Recall that, as with the deterministically-interwired network,
multiple paths connect any two endpoints, and, again, exact calculation of
performance parameters is too expensive to be feasible.

Figure 5.11: A 16 x 16 network constructed from two non-dilated networks
each connecting 16 endpoints. Redundant paths between an input and an
output are shown. The figure is from [2].

Figure 5.13: The probability of successful message transmission, P{Success},
is shown for each of the three networks in Figures 5.11, 5.12, and 5.7. The
results for the replicated network are shown in black, and are exact; the
results for the deterministically-interwired network are shown in grey, and
those for the randomly-interwired network are shown dashed. See the text for
a discussion of the results.

The performance of the three networks can nonetheless be compared
effectively using the exact method for the first and the approximation method
for the second and third. In the cases where the approximation method was
used, we have specified that the solution must lie within ±1% of the actual
value with 95% confidence.

We see in Figure 5.13 the probability of successful message transmission
for each of the three networks, and in Figure 5.14 the bandwidth, or
throughput, for each of the three networks. As was also found in [2] (although
using a much more complex model), the deterministically- and randomly-interwired
networks perform identically to within the resolution of the approximation,
and the replicated network performs considerably worse than either.
Figure 5.14: The bandwidth, or throughput, is shown for each of the three
networks in Figures 5.11, 5.12, and 5.7. The results for the replicated
network are shown in black, and are exact; the results for the
deterministically-interwired network are shown in grey, and those for the
randomly-interwired network are shown dashed. See the text for a discussion
of the results.
5.8  Variance of Estimators in the Approximation
Method and in Direct Simulation


A simple and well-known theorem in Monte Carlo methods confirms what we
have seen in our examples: the estimator $h$ will always have lower variance
than will the estimator $f$. In [10], the theorem is paraphrased as, "if, at any
point of a Monte Carlo calculation, we can replace an estimate by an exact
value, we shall reduce the sampling error in the final result." This is why
we can see our method of approximating the exact equations as a means
of reducing the variance of the simulation. The exact equations are too
expensive to solve exactly for large networks with many dependent paths,
but knowledge and use of the exact equations on a subproblem makes it
possible for us to realize in simulation the reduced sampling error promised
by the theorem.

The argument in [10] is short enough that we reproduce it here, adapted
to our particular estimators.

We note that $f(R)$ and $h(l_{B_1}, \ldots, l_{B_m})$ have the same mean, $\mu$.
Because $f$ is a Bernoulli (single-trial binomial) variable, it has variance
$\mu(1 - \mu)$. The variance of $h$ is given by

$$\mathrm{Var}(h) = E[h^2] - E[h]^2 = E[h^2] - \mu^2$$

Thus

$$\mathrm{Var}(f) - \mathrm{Var}(h) = \mu - \mu^2 - E[h^2] + \mu^2 = E[h - h^2]$$

Now $h$, being a loading probability, lies in the interval $[0, 1]$, so that
everywhere $h \ge h^2$. But $h$ takes on with nonzero probability at least some
values that are not 0 or 1, because $h$ is not Bernoulli, so that for some tuples
$(l_{B_1}, \ldots, l_{B_m})$, $h - h^2 > 0$. Thus $E[h - h^2] > 0$ and
$\mathrm{Var}(f) > \mathrm{Var}(h)$, as we desired to show.
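The identity behind this argument can be verified exactly on a small case. The sketch below is illustrative only: it assumes two independent input channels, each loaded with probability 7/16, feeding a 2 x 2 switch under uniform addressing, so that h = (1/2)^(number of loaded inputs). These values and all names are chosen for the example and do not come from the networks studied above.

```python
from fractions import Fraction
from itertools import product

P_LOADED = Fraction(7, 16)  # assumed P{l_Bi = 1}; loads taken as independent


def h(loads):
    """Exact conditional probability that the output channel is empty."""
    return Fraction(1, 2) ** sum(loads)


def pmf(loads):
    """Probability of a tuple of independent input loads."""
    prob = Fraction(1)
    for load in loads:
        prob *= P_LOADED if load else 1 - P_LOADED
    return prob


tuples = list(product((0, 1), repeat=2))
mu = sum(h(t) * pmf(t) for t in tuples)            # common mean of f and h
e_h2 = sum(h(t) ** 2 * pmf(t) for t in tuples)     # E[h^2]
var_f = mu * (1 - mu)                              # Bernoulli variance
var_h = e_h2 - mu ** 2
gap = sum((h(t) - h(t) ** 2) * pmf(t) for t in tuples)  # E[h - h^2]
print(mu, var_f, var_h, gap)
```

Because the arithmetic is done with exact fractions, the difference of the two variances equals E[h - h^2] exactly, and it is strictly positive because h takes the non-degenerate values 1/2 and 1/4 with nonzero probability.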


5.9  Expense  of  the  Approximation  Method 

One is tempted by the results of Section 5.7.1 to ask what happens if we again
increase the number of stages for which $h$ performs an exact calculation.
Although it seems likely that the variance would be reduced further, the
experiment is not likely to be worth our while, as the network for which
we would be calculating exactly the loading probabilities would now have a
much larger number of redundant paths leading from its sources to sink 0.
Thus we would be faced with the same problem of exponential growth as
before.

Our method can only reduce the expense of simulation so much, until the
exponential growth of the running time of each iteration with the number of
dependent channels dominates the savings in number of iterations. In fact,
the final stage of a network, considered by itself, will always constitute a
Banyan network, and so we can always calculate loading probabilities
across it at the same asymptotic expense as simulation: there are no
redundant paths, and the reduction of the number of iterations with the variance
will be realized in reduced running time.

The final two stages of the network of Figure 5.7 do not constitute a
Banyan network, but the number of redundant paths between a source and
a sink is small (two), and so in this case the running time is also significantly
reduced. In many types of multipath networks larger final subnetworks
constitute Banyan networks or have small numbers of dependent channels;
in these networks it will be profitable to use exact calculation for more than
one final stage.

In a network with $N$ stages with $M$ channels between each stage, if exact
calculation  is  used  for  the  final  A  stages,  then  in  the  worst  case,  where  the 
load  on  every  channel  between  two  stages  of  switches  in  the  final  A  stages 
is  dependent  on  the  loads  on  the  other  channels  between  those  two  stages 
of  switches,  the  running  time  of  exact  calculation  for  the  final  stages  will 
be  O^A  There  will  be  .V  —  A  stages  simulated,  at  an  expense  of 

$O((N - A)M)$ steps per simulation, so that the worst-case performance will
be


0 


+ 


where $\epsilon$ is the complement of the desired confidence; $d$ is the deviation factor;
$\mu$ is the mean and $\sigma^2$ the variance of $h$, the result of exact calculation.

The worst-case result is misleading, however, because in networks built in
practice, the subnetworks constituted by final stages have smaller numbers of
dependent channels than does the entire network. In fact, if the final stages
for which exact calculation is performed constitute a Banyan network, then
the running time of exact calculation is $O(AM)$, and the asymptotic running
time of the approximation is simply


0 


where once again $\epsilon$ is the confidence complement, $d$ the deviation factor, $\mu$
the mean and $\sigma^2$ the variance of $h$.

5.10  Conclusions 


We have developed methods of calculating the value of some performance
parameters for multistage networks (the normalized throughput and
probability of successful message transmission) by computing the loading
probabilities of channels leading to sinks.

We showed initially that independence of loads on channels in a Banyan
network allows a simple means of calculating channel loading probabilities
for these networks, and described a way of composing operations on loading
probability mass functions to derive expressions for the performance
parameters. We presented a program that derived such expressions and could be
used for numerical calculation of performance parameters.

We then saw that independence of loads on channels will not hold in
multipath networks, and developed equations for channel loading
probabilities in these networks. We showed that the number of equations that must
be solved by this method is exponential in the number of dependent paths
in the network, rendering the method impractical for large networks. We
presented a program that could be used to calculate channel loading
probabilities exactly for small networks, and discussed its performance in the
cases of multipath networks and Banyan networks.

We developed a method of approximate solution of the exact equations,
and compared its performance to that of direct simulation. We developed
programs for both our approximation method and direct simulation. We
saw that use of the exact equations will always afford some improvement in
performance, by reducing the variance of the estimator in question; and we
discussed cases where the reduction in running time will be quite substantial.


5.11  Future  Work 

The literature on Monte Carlo methods contains many techniques for
reducing the variance of estimators. Some of these are particularly promising
for our application. For example, the use of stratified sampling, where the
strata are separated by the number of messages generated by sources in
a particular cycle, should be easy to implement and promises a significant
reduction in variance.

We look forward to comparing more results of the application of these
methods to the results of more faithful and complex simulations performed
at M.I.T.'s Transit Group. The aim of the Transit Group's simulations is to
select a network structure for implementation in a large-scale multiprocessor.
We expect from the results cited in Section 1.1 that our model will be useful
in comparing candidate networks.


Appendix  A 

Mathematica  Procedures  for 
Modelling  Banyan  Networks 

concentrate::usage =
  "concentrate[x, n] concentrates the LPMF x to n channels."

concentrate[x_, n_] :=
  (* get distribution for 0 through n-1 channels, and add
     as last element the sum of the rest of the channels. *)
  Append[Take[x, n], Apply[Plus, Drop[x, n]]]

discreteconvolution::usage =
  "discreteconvolution[x, y] treats x and y as 0-based
   vectors and returns their discrete convolution."

discreteconvolution[x_, y_] :=
  Block[{xlgth, ylgth, lgth},
    xlgth = Length[x];
    ylgth = Length[y];
    lgth = xlgth + ylgth - 1;
    (* in summation, portions of sequence with indices
       out of range for sequences must be treated as
       0. *)
    Table[Sum[If[k < 1 || k > xlgth ||
                 (n-k+1) < 1 || (n-k+1) > ylgth,
                 0,
                 (* because of the 0->1 index
                    translation, we increase the y-index
                    to shift the result sequence back
                    down to begin at 1. *)
                 x[[k]] y[[n-k+1]]],
              {k, xlgth}],
          {n, lgth}]]

bundle::usage =
  "bundle[x, y] forms the LPMF that results from bundling
   two input bundles with LPMFs x and y."

bundle[x_, y_] :=
  discreteconvolution[x, y]

switch::usage =
  "switch[x, p] returns the LPMF of an output bundle to
   which x is switched with probability p."

switch[x_, p_] :=
  Block[{lgth},
    lgth = Length[x];
    Table[Sum[x[[i+1]] Binomial[i, n] p^n (1-p)^(i-n),
              {i, n, lgth-1}],
          {n, 0, lgth-1}]]
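For readers without access to Mathematica, the three operations above can be approximated by the following Python transliteration. It is a sketch written for illustration, not part of the original program: an LPMF is represented as a plain list whose element i is the probability of a load of i messages, and the snake_case name `discrete_convolution` is a choice made here.

```python
from math import comb


def concentrate(x, n):
    """Keep the probabilities of loads 0..n-1 and lump every load of n or
    more into a final element, giving an LPMF over loads 0..n."""
    return list(x[:n]) + [sum(x[n:])]


def discrete_convolution(x, y):
    """Discrete convolution of two 0-based probability vectors."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def bundle(x, y):
    """LPMF of the bundle formed from two independent input bundles."""
    return discrete_convolution(x, y)


def switch(x, p):
    """LPMF of an output bundle to which each message in a bundle with
    LPMF x is switched independently with probability p."""
    top = len(x) - 1
    return [sum(x[i] * comb(i, n) * p ** n * (1 - p) ** (i - n)
                for i in range(n, top + 1))
            for n in range(top + 1)]


# Two channels, each loaded with probability 1/2, are bundled, switched
# toward one output with probability 1/2, and concentrated to one channel.
pair = bundle([0.5, 0.5], [0.5, 0.5])
routed = switch(pair, 0.5)
limited = concentrate(routed, 1)
print(pair, routed, limited)
```

In this usage the final list gives the probabilities that the single output channel is empty or loaded; the same composition of `bundle`, `switch`, and `concentrate` mirrors the way the Mathematica procedures are intended to be chained.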